UNIVERSITY OF CALIFORNIA

SANTA CRUZ


PHYSICS EDUCATION RESEARCH:

SUMMATION and APPLICATION

A thesis submitted in partial satisfaction

of the requirements for the degree of


MASTER OF SCIENCE

in

PHYSICS

by

Michael Eric Burnside

September 2002



The Thesis of Michael Eric Burnside

is approved:


________________________

Professor Fred Kuttner, Chair


________________________

Professor Bruce Rosenblum


________________________

Professor Joshua Deutsch


_______________________________________

Frank Talamantes

Vice Provost and Dean of Graduate Studies










Copyright © by

Michael Eric Burnside

2002

Table of Contents

List of Figures ....... viii
List of Tables ....... ix
Abstract ....... x
Dedication ....... xi
Acknowledgments ....... xii

Part I : The Paper ....... 1
Chapter 01 : Introduction to the Paper ....... 2
Chapter 02 : Why Teach Physics? ....... 3
Chapter 03 : PER, What Is It? ....... 5
Chapter 04 : Epistemology ....... 6
Chapter 05 : Concrete Small Issues ....... 10
Chapter 06 : Assessment Tools ....... 13
Chapter 07 : Misconceptions ....... 15
Chapter 08 : Interactive Engagement Methods ....... 18
Chapter 09 : A Potpourri of Interactive Engagement Points ....... 22
Chapter 10 : Teachers and TAs ....... 26
Chapter 11 : Validation, Literature Problems, and Other Issues ....... 29
Chapter 12 : Conclusion to the Paper ....... 33
References ....... 34

Part II : The Data Analysis ....... 40
Chapter 13 : Introduction to the Data Analysis ....... 41
Chapter 14 : FCI Table Modifications and Data ....... 42
Chapter 15 : MPEX Data on CCC Physics 2B and CCC Physics 10 ....... 49
15.1 : MPEX, Template and Expansion ....... 49
15.2 : Commentary on JM Data in Parallelism with Redish et al. Using ET1 ....... 54
15.3 : Commentary on FK Data in Parallelism with Redish et al. Using ET2 ....... 56
15.4 : Comparison and Contrast Between Groups JM and FK ....... 59
Chapter 16 : MB Data and Concentration Analysis on CCC Physics 4C ....... 63
16.1 : MB Paper Template and Expansion ....... 63
16.2 : Concentration Analysis on Group CF Data ....... 67
16.3 : Introduction and Discussion of Table Alpha ....... 73
Chapter 17 : FCI Data on CCC Physics 4A ....... 79
17.1 : Presentation of Group JZ's Raw FCI Data in Table A ....... 79
17.2 : Table B, A New Format ....... 83
17.3 : Table C, Misconception to Question ....... 87
17.4 : Table D, Misconception to Large FCI Divisions ....... 90
17.5 : Additional Insights ....... 93
Chapter 18 : FCI Data on CCC Physics 11 ....... 98
18.1 : Presentation of Group PG's Raw FCI Data in Table M ....... 98
18.2 : Table N, A New Format ....... 101
18.3 : Analysis ....... 104
Chapter 19 : Conclusion to the Data Analysis ....... 109

Part III : The Thesis ....... 110
Chapter 20 : Introduction to a Thesis ....... 111
Chapter 21 : Assessment Tools ....... 111
21.1 : Paper 01, Force Concept Inventory ....... 111
21.2 : Paper 02, Average Normalized Gain, <g> ....... 114
21.3 : Paper 03, Mechanics Diagnostic ....... 118
21.4 : Paper 04, Mechanics Baseline ....... 121
21.5 : Paper 05, Force and Motion Conceptual Evaluation ....... 123
21.6 : Paper 06, Conceptual Survey in Electricity and Magnetism ....... 125
21.7 : Paper 07, Thermal Concept Evaluation ....... 130
21.8 : Paper 08, Concentration Analysis ....... 131
Chapter 22 : Interactive Engagement Methods that Retain the Lecture ....... 135
22.1 : Paper 09, Interactive Lecture Demonstrations ....... 136
22.2 : Paper 10, Audience Paced Feedback ....... 137
22.3 : Paper 11, Peer Instruction ....... 139
22.4 : Paper 12, Socratic Dialogue Inducing ....... 141
22.5 : Paper 13, Inquiry Experiences ....... 144
22.6 : Paper 14, Supervised Practice ....... 147
Chapter 23 : Interactive Engagement Methods that Replace the Lecture ....... 151
23.1 : Paper 15, Studio Physics ....... 151
23.2 : Paper 16, Integrated Math, Physics, Engineering, and Chemistry ....... 155
23.3 : Paper 17, History and Philosophy of Science ....... 158
Chapter 24 : Interactive Engagement Methods Potpourri ....... 163
24.1 : Paper 18, Communicating Physics Through Story ....... 163
24.2 : Paper 19, Linking the Domains of Mechanics and Electromagnetism ....... 165
24.3 : Paper 20, Physics Jeopardy Problems ....... 170
24.4 : Paper 21, Promoting Conceptual Change ....... 171
Chapter 25 : Interactive Engagement Methods Highlighting the Computer ....... 175
25.1 : Paper 22, Physlets and Just-in-Time-Teaching ....... 175
25.2 : Paper 23, Numerical Integration ....... 177
25.3 : Paper 24, To Simulate or Not to Simulate? ....... 178
25.4 : Paper 25, Online Homework ....... 180
Chapter 26 : Interactive Engagement Curricula ....... 183
26.1 : Paper 26, Physics Goes Practical ....... 183
26.2 : Paper 27, Resource Letter: TE-1 Teaching Electronics ....... 184
26.3 : Paper 28, Macroscopic Phenomena and Microscopic Processes ....... 184
26.4 : Paper 29, Modernizing Introductory Physics ....... 189
Chapter 27 : Epistemology ....... 194
27.1 : Paper 30, Implications of Cognitive Studies ....... 195
27.2 : Paper 31, More than Misconceptions: Multiple Perspectives ....... 199
27.3 : Paper 32, Maryland Physics Expectations Survey ....... 205
27.4 : Paper 33, Physics Students Learn by Rote ....... 208
27.5 : Paper 34, Productive Learning Resources of Students ....... 210
27.6 : Paper 35, Methods to Advance Epistemological Development ....... 215
Chapter 28 : Student Misconceptions ....... 219
28.1 : Paper 36, The Reasoning Behind the Words in Electrostatics ....... 219
28.2 : Paper 37, Student Difficulties in Applying a Wave Model ....... 221
28.3 : Paper 38, Student Difficulties with Rays ....... 226
28.4 : Paper 39, The Relativity of Simultaneity and the Role of Reference Frames ....... 227
28.5 : Paper 40, Student Understanding of the First Law of Thermodynamics ....... 231
Chapter 29 : Women and Minorities ....... 236
29.1 : Paper 41, Women's Responses ....... 236
29.2 : Paper 42, Women in Physics: A Review ....... 239
29.3 : Paper 43, Reform in a Multi-Native Language Environment ....... 241
Chapter 30 : Awards for Excellence in Teaching ....... 244
30.1 : Paper 44, F. Reif's Millikan Lecture 1994 ....... 244
30.2 : Paper 45, D. Goodstein's Oersted Medal Speech 1999 ....... 253
30.3 : Paper 46, A. Van Heuvelen's Millikan Lecture 1999 ....... 255
30.4 : Paper 47, L. McDermott's Oersted Medal Lecture 2001 ....... 260
Chapter 31 : Defense of PER ....... 268
31.1 : Paper 48, Who Needs Physics Education Research!? ....... 269
31.2 : Paper 49, How Do We Know If We Are Doing A Good Job? ....... 272
31.3 : Paper 50, Impressions and Importance of PER ....... 275
Chapter 32 : Conclusion to a Thesis ....... 276
Bibliography ....... 277

List of Figures

The following list of six figures is complete. The odd numbering is a deliberate attempt at parallelism with the original papers.

Figure 2(b) : A-D plot ....... 51
Figure 1 : S-C plot ....... 71
Figure Z ....... 246
Figure Y ....... 248
Figure A ....... 262
Figure B1 ....... 264
Figure B2 ....... 264

List of Tables

The following list of twenty-one tables is complete. The odd numbering is a deliberate attempt at parallelism with the original papers.

Table I: FCI Modified, Newtonian Concepts in the FCI ....... 44
Table II: FCI Modified, A Taxonomy of Misconceptions ... ....... 45
Table III: FCI Inventory Scores ....... 46
Table V: FCI Modified Item Interpretation and Response Frequencies 1/2 ....... 47
Table V: FCI Modified Item Interpretation and Response Frequencies 2/2 ....... 48
Table IV: MPEX Percentages of students giving favorable / ... ....... 50
Table ET1: MPEX Data on Group JM ....... 52
Table ET2: MPEX Data on Group FK ....... 53
Table I: MB Modified, Newtonian Concepts on the Mechanics Baseline ....... 65
Table II: MB Modified, Scores on the Mechanics Baseline ....... 66
Table III: CA Modified, Concentration Analysis - Implications of Patterns ....... 69
Table IV: Concentration Analysis for Group CF ....... 70
Table Alpha: New Format MB Data for Group CF ....... 72
Table A: Cleaned-Up Raw FCI Data for Group JZ ....... 82
Table B: New Format FCI Data for Group JZ 1/2 ....... 85
Table B: New Format FCI Data for Group JZ 2/2 ....... 86
Table C: Misconception to Question Using Group JZ Data ....... 89
Table D: Misconception to Large FCI Divisions Using Group JZ Data ....... 92
Table M: Cleaned-Up Raw FCI Data for Group PG ....... 100
Table N: New Format FCI Data for Group PG 1/2 ....... 102
Table N: New Format FCI Data for Group PG 2/2 ....... 103

ABSTRACT



PHYSICS EDUCATION RESEARCH: SUMMATION and APPLICATION

Michael Eric Burnside



This work summarizes sixty-two PER-related articles and papers. Most of these articles and papers were published in either The American Journal of Physics or The Physics Teacher from 1992 through 2002. The FCI, MB, and MPEX were given to various Cabrillo Community College physics classes, and the resulting data are analyzed. This work is presented in three parts: a short introductory overview paper, a medium-length data section, and a long, individualistic book-review-style thesis.



Dedication



to my father, Don Burnside, and my mother, Joyce Heigel.

-- I love you, thank you.



Acknowledgments


I wish to acknowledge everyone and the kitchen sink:

To Fred Kuttner -- it’s been a while since I was so complimented.

To Steve Orr -- it’s been a while since I received such kindness.

To my reading committee, thank you for your signatures.


To my typing committee, I needed you, thank you from the bottom of my heart.

To my editing committee, thank you for looking at this with new eyes.

To the UCSC graduate students of 2002, thank you for your attendance at my orals.

To the UCSC undergrads, no graduate student ever had such wonderful guinea pigs.

To the UCSC professors, I enjoyed my classes and am glad you admitted me.

To Cabrillo Community College teachers and students, thank you very much for helping me.

To my sister -- I enjoyed and needed your visit.

And to the kitchen sink -- long may you work.












Part I : The Paper

Chapter 01 : Introduction to the Paper


Teaching is an honor, an obligation, and a joy. You, Hestenes' "Oblivious", have both benefited from good teachers and been harmed by bad teachers. It is society's need and my hope that you choose to strive for excellence in your duty as a teacher. Physics Education Research (PER) can help ease and shorten the path between where you are and where you want to be as an effective, beneficial instructor. PER has many components and details. For the teacher, the fundamental benefit of PER is access to the knowledge of fellow teachers and researchers. PER battles the isolation of the solitary teacher. If you have knowledge to advance the art of teaching, share it. In turn, do not ignore that which others have offered you. Their years of sweat and toil are yours for the reading of a paper.

I say art of teaching. Yet one of the fundamental purposes of PER is to turn this art into a science. McDermott argues that PER is both an empirical science and a fitting subject for research by physics faculty. Hammer states that PER is not yet a science, even though that is the goal. Hestenes informs us that while inspired teaching may be a nontransferable gift, good teaching is an acquirable skill. So, art or science, the distinction is material only to certain mindsets. Other mindsets enjoy the act of trying to improve the physics knowledge and ability of others. To improve others requires many skills on the part of a teacher.

The first skill is the ability to understand and do physics oneself. Trivial as the statement seems, it is not ignorable. A case in point is a co-worker's mother who teaches Spanish in a Georgia high school. This is the mother's first year; she did not speak, read, or write Spanish at the beginning of the year. While I have not assessed her students' Spanish skills, I am incredulous that she was even hired.

The second skill is a real understanding of whom you're teaching. While social skills and interactions will turn out to be important, even more critical is knowledge of your students' current beliefs about physical phenomena. Researchers state that students will not believe and use the physics you provide them until their previous conflicting beliefs are explicitly proven wrong. Thus, at an absolute minimum, you must know the predominant beliefs held by your students, so that you know what to prove wrong. You are not filling a vacuum.

The third skill is mastery of both the big and small arts of connecting a known physical belief to a known physical person. One example of a big art is constructivism. This is an educational theory whose basic premise is that students have to construct their own knowledge. Teachers cannot give knowledge to students; teachers can only create an environment in which students have an improved opportunity to succeed. Entire books have been written on this theory. One example of a small art is the definitions of "negatively charged" and "neutral" in the context of electromagnetism. Between five and twenty percent of students believe that these two definitions are synonymous. The students reason that "positive" means "yes" and "negative" means "no"; so, "negatively charged" means "no charge". They also reason that "neutral" means "not charged". So it's pretty obvious that something which has "no charge" is "not charged". Ipso facto -- negatively charged materials are neutral. This little gem was in a Letter to the Editor of the American Journal of Physics (Am. J. Phys.). Its author points out that this gem was actually difficult to discover, as students can quote correct physics and still not be able to perform correct physics. This in turn raises the issues of Language, Student Interviews, and Interpretation, all of which I will leave to the body of the paper.

Here in the Introduction, I simply want to welcome you to a world that can supply useful knowledge and insights, even if it invites you to raise an eyebrow from time to time. PER is young and still quite messy. Chemistry was born out of alchemy, astronomy out of astrology. Biology is only now coming into its own, with DNA. Education is on this path, with interesting work being done in brain research, made possible in large part by MRIs. Still, I side with Hammer: PER is not yet a science, but then neither are large chunks of the medical profession. If it's beneficial, use it.


Chapter 02 : Why Teach Physics?


PER focuses on the student and how he learns. Thus, there is a lot of research on finding recurring, prevalent student misconceptions and methods to replace these misconceptions with accepted physics beliefs. PER stretches rather further afield in search of a general theoretical framework in which to place the experimentally discovered PER facts. PER even briefly touches on why we teach physics in the first place, and that is where we'll start our odyssey. There are basically two views on why to teach physics. One is job/business oriented. The other is a bit more idealistic.

Four authors speak to the reasons we should teach physics. McCullough argues that the underrepresentation of women in physics is unfortunate because "to sustain our technological civilization, every one of our future workers must be prepared in science, engineering, and mathematics."01 Heuvelen states that the educational "desired outcomes are raised by two sources: Bloom's Taxonomy and three recent workplace studies." These studies are 1) Shaping the Future, by the National Science Foundation, 2) the ABET Engineering Criteria 2000, and 3) the American Institute of Physics Survey. Heuvelen's basic goal is to prepare physics students for a workplace in which 82% of B.S. physics graduates have final careers in industry and government, and in which physics knowledge is the least used skill listed in the survey.02 Redish broadens this, arguing that "society has a great need not only for a few technically trained people but for a large group of individuals who understand science."03 Goodstein takes a more idealistic approach. He argues that we need a reevaluation of how we do our jobs. Goodstein states that we physicists understand "in a very large measure, how the world works", and that "to live in ignorance of [this] understanding should be intolerable." Goodstein believes that "the undergraduate physics major is the liberal arts education of the 21st century." He provides some history, noting that educating the masses, not just the elites, started in the early 1900s with American compulsory attendance laws that Europeans of the age found "fantastic and ridiculous." Goodstein also speaks against our current focus on producing Ph.D.s, primarily because exponential growth in the Ph.D. job market is gone forever.04 Why we teach physics is more than an academic issue. Why we teach largely dictates to whom, and to how many, we teach. These in turn strongly influence how and what we teach. We should next address: What is PER?

Chapter 03 : PER, What Is It?


Three answers, provided by three of the more prominent researchers in the field, follow. McDermott states that PER is an "empirical applied science" and that it "should be conducted by science faculty within science departments." PER is Physics Education Research, and for McDermott:

Research on the learning and teaching of physics is essential for cumulative improvement in physics instruction. Pursuing this goal through systematic research is efficient and greatly increases the likelihood that innovations will be effective beyond a particular instructor or institutional setting. The perspective taken is that teaching is a science as well as an art. Research conducted by physicists who are actively engaged in teaching can be the key to setting high (yet realistic) standards, to helping students meet expectations, and to assessing the extent to which real learning takes place.05


Hestenes joins McDermott in an oblique way and adds a hint that PER isn’t always viewed positively:

PER is a credible discipline with a body of reliable empirical evidence, clarified research issues, and able researchers. It is a serious program that applies to our teaching the same scientific standards we use in physics research. Unfortunately most of our colleagues are oblivious and some who aren’t are contemptuous.06

Hammer does not believe that PER is a science but does believe that it has “developed compelling evidence to discredit traditional methods and convictions.” Hammer states that:

While PER is not yet an applied science, it does provide perspectives that expand, refine, and support instructors' perceptions and judgments. PER helps expand the instructor's concentration to include not only physics content but also how the students interact with that content.07

So, rather than play "let's define science," let us agree that there is valuable material in this body of work, and go find it. Most teachers are like engineers. They'd like to know what the common student misconceptions are and what methods other teachers have used to replace these misconceptions with currently accepted physics beliefs. Later, perhaps, an interest in what tools were used to find these misconceptions in the first place, or to confirm that they have been successfully replaced by "real physics," comes into focus. Swiftly following is an interest in determining just how valid these tools, upon which so much rests, are. Prior to all this is epistemology. After all, it's so much simpler just to tell a stranger the physics truth and rely upon his memory to replicate the truth you've thoughtfully gifted him with. Perhaps a bit of homework for practice to bring him up to speed and a few labs to make it real, and we're done. Other than convincing ourselves that the fellow has an adequate memory, what's the problem? In other words, why do we even care what the student believed prior to us telling him the truth? Much less take the time and energy to find out what he believed? Individualized instruction in a 200+ student introductory calculus-based physics sequence is hardly conceivable, much less implementable. So, before we even consider what the misconceptions are, or how we are supposed to address them in our teaching, let us try to address why we care about their existence.


Chapter 04 : Epistemology


At a yard sale this weekend, I saw a book on constructivism. I opened it up and noted that we are now "post-epistemological." I closed the book. The owner was an elementary teacher getting her Master's Degree in Education. I did not buy the book. What's the point of this story? -- That the definition of epistemology is author-dependent. It is, at bottom, all the non-physics, imported-from-other-departments ideas and theories. Epistemology mostly originates from Education, Philosophy, and Psychology departments and is included as a beginning paragraph in most published PER papers. Unfortunately, these paragraphs manage to both repeat themselves and differ. The following short paragraphs are an attempt to hit the high points only.

We commence with the views of Reif, who among other distinctions is a Professor of both Physics and Psychology. In fairness to Reif, I would like to point out that Paper 44 in Chapter 30 of Part III gives a much better presentation of his views, as they are verbose and example-dependent. Reif's central instructional goal is to help students acquire a "modest amount of basic knowledge which they can flexibly use." Flexible utility is paramount because science is the ability to predict or explain diverse phenomena using a small amount of basic knowledge, and because the learned knowledge must retain its usefulness in a complex and rapidly changing world. The cognitive abilities required to ensure scientific knowledge can be flexibly used include: interpretation, description, organization, analysis, construction of solutions, and checking of these solutions. Unfortunately, the most common method of teaching problem solving is that of example and practice. This process is flawed to the point that Reif labels it "unwise." He goes on to offer a heuristic strategy that is far more effective.08

Next up is Heuvelen. He presents Bloom's Taxonomy: Knowledge, Comprehension, Application, Analysis, Synthesis, and Evaluation. Heuvelen asserts that humans are pattern-recognition animals who try to match their new experiences to previous events. He refers to studies in linguistics that emphasize the need for referents, the requirement for multiple exposures, the benefits of multiple representations, the helpfulness of interactive simulations, and the critical importance of starting early. Brain research has shown how detrimental aging is to the development of new synapses and how ingrained old patterns become. Active learning is important because studies show that we remember only 3% of what we hear. Effective methods of inquiry are based on student observation and modeling of real phenomena.02

Kalman et al. have many references to learning theory and philosophy. The authors present Posner's learning framework for conceptual change. Emphasis is placed on two points:

First, students must know of problems with their personal scientific conceptions, usually via curriculum-induced conceptual conflict. Second, and of equal importance, the student must not compartmentalize his knowledge. People can hold contradictory beliefs. Replacement, not simple assimilation, is the teaching goal.

Kalman et al. state that there are two methods of problem solving -- Templates and Paradigms. The key difference is that students who compartmentalize knowledge and apply different templates to different knowledge subsets lack the ability to apply principles garnered from one problem to an apparently different problem. Furthermore, even if problem-solving methods change, knowledge-acquisition methods are likely to remain compartmentalized unless critical thinking skills are developed. The authors conclude with the following observations. Students often hold views different from, or alternative to, those that they will be taught in their courses. The students will not easily relinquish their original viewpoints, because those viewpoints explain observations and required effort to construct. Conceptual change requires students to critically examine their view of the world. For change to occur, students must make value judgments, rate ideas, and accept or reject material based on standards. Producing change thus requires Evaluation, the highest ability in Bloom's taxonomy.10

Bao and Redish, in constructing a model of student knowledge, appeal to neuroscience, cognitive science, and education research. The agreed-upon core elements are: (1) memory is associative, (2) cognitive responses are productive, and (3) cognitive responses are context dependent (inclusive of the student's state of mind). As this is insufficient, the authors also advance several structures proposed by researchers: (A) patterns of associations (neural nets), (B) primitives / facets, (C) schemas, (D) mental models, and (E) physical models. The authors define these terms in their paper. Some of the definitions are rather involved.11

Galili and Hazan present a full page of theoretical background on the structure of knowledge. They present Mach, Bruner, Piaget, di Sessa, and Minstrell. The first two argue that people are unable to grasp, remember, or manipulate a huge amount of complex content without knowledge of structure. Piaget is the founder of constructivist theory, which "pursues picturing human cognition by its elements related in schemata." di Sessa argues for the existence of stable cognitive constructs spontaneously created in the form of fundamental self-explanatory patterns (p-prims). Minstrell proposes facets-of-knowledge as the means by which students understand particular physical settings. Given the great versatility of naive conceptions, instruction should aim at the essence, the learner's schema. The authors also advocate presenting "unsuccessful" attempts at physics conceptual development that nevertheless helped to attain present scientific knowledge; these would show students a realistic picture of the complex transformation of knowledge from old to new.12

Hammer argues, in a paper discussed later, that "misconceptions" is a misnomer. Here he talks about epistemological beliefs, misconceptions, and inquiry practices:

Epistemological beliefs influence how students reason in a physics class. Students' beliefs about the course and the knowledge and reasoning it will entail impact student actions. Some students believe that understanding in physics means being familiar with a collection of facts and formulas, that the formalism of physics is only loosely associated with everyday experiences, and that learning physics means memorizing information supplied by the professor or textbook. Other students believe understanding physics means developing a sense of its underlying principles and coherence, that formalism does represent everyday experiences, and that learning physics is applying and modifying one's own understanding. Thus in an extended student debate the instructor often faces the dilemma of dealing with physics content misconceptions and appropriate but nascent epistemological beliefs. Jumping in to "fix" a misconception can all too often reinforce the idea that all truth comes from the instructor; to not jump in risks a further hardening of a false belief.


Misconceptions are strongly held, stable cognitive structures that differ from expert conceptions. Misconceptions fundamentally affect student understanding of science and must be overcome for students to achieve expert understanding. This is in contrast to the idea that students are simply ignorant. For the instructor to simply transfer information is ineffectual. Misconceptions must be overcome before expert opinion will be accepted by the student, who finds his current misconception both reasonable and useful. This overcoming is based on a process of drawing out explicit statements of the misconception by the students, confuting the misconception with arguments and evidence, and then promoting new, more appropriate conceptions.


Inquiry Practices stands the traditional view on its head, arguing that social participation in the scientific community is a requirement to build individual knowledge and ability. … By this view, students and physicists participate in socially constructed, situated practices; scientific knowledge and practices are collective constructs of the scientific community. Fundamentally, learning physics means becoming a member of, and adopting the practices of, the community of physicists. These practices can be quirky and arbitrary as seen from the outside. In the class discussion, Harry responded to Amelia's concern that space had gas in it that would slow down a moving ball. His response, "we're talking about ideal space," was meant and received as a joke, with the participants laughing. Assuming ideal conditions is not natural or routine ... There was much discussion about whether assuming no friction on an earth-bound surface also meant that gravity had to be turned off. And of course, from a thermodynamic point of view, a physicist would have to agree. In Newtonian force problems this is dismissed under the rubric "ideal". Under Inquiry Practices, instructor intervention would be aimed at establishing certain social practices rather than primarily directed at individual knowledge and abilities.07

Epistemology is threatening to overwhelm my paper. I will refer you to Part III. In Chapter 27, Paper 30, Redish holds forth about the need for a general framework, noting that "collecting data into a wizard's book of everything that happens is not science." He speaks at length about four principles: 1) Building Patterns, the Construction Principle; 2) Building on a Mental Model, the Assimilation Principle; 3) Changing an Existing Mental Model, the Accommodation Principle; and 4) the Individuality Principle.03 Hammer returns in Paper 34 of Part III. His essential point is that mental phenomena are attributed to the action of many agents acting in parallel, sometimes coherently, sometimes not. This is contrasted with a misconception, which is a single nonconforming cognitive unit. He also raises the ideas of Anchoring Conceptions and Bridging Analogies.13 In Paper 35, Elby asserts that "epistemological sophistication is valuable" for students. His paper "shows instructional practices and curricular elements explicitly intended to further epistemological development."14 In Paper 32, Redish et al. present the Maryland Physics Expectations (MPEX) survey and some results. The MPEX probes student expectations about the process of learning physics and the structure of physics knowledge.

What students expect will happen in their physics course plays a critical role in how they respond to the course. Their expectations play a role in what the student pays attention to and what he chooses to ignore. It is a factor in the student’s selection of activity by which he will construct his own knowledge base.15

And finally, the authors of Paper 43 remind us that: "A lifetime of experiences pushing boxes and riding in cars is not dismissed for the sake of a memorized equation, even if we tell students explicitly to do so."16 The upcoming chapter is much more concrete than this one, but it still focuses on non-physics issues affecting the teaching of physics.


Chapter 05 : Concrete Small Issues


This chapter is a potpourri of small issues, some interesting, some merely important. Students seek efficiency -- the achievement of a satisfactory grade with the least possible effort -- at a severe, unnoticed penalty on how much they learn.15 If students are satisfied, they will work; however, if they become discouraged, students will engage in intellectual damage control and minimize their amount of work.17 The result of instruction is to substantially lower the effort a student perceives as necessary for success in physics.15 "Students 'play the game' and distort their behavior to enhance their grades at the cost of achieving deep understanding of physics." This despite there actually being no correlation between the amount of distortion and grades.18 "Inquiring skills are needed in part because students do not have enough experience with everyday phenomena to tie the concrete experience to the scientific explanation."19 Many students are not able to identify and evaluate different points of view. They merely repeat themselves when asked to explain or defend their statements.07 People have different attention spans and different abilities to discriminate between important and peripheral issues.22 Effective teaching, as measured by student learning (as distinct from enthusiasm), is "not tightly linked" to student evaluations of the teacher, the course, or their own learning.05 The correlation of student opinion to "the real success of a course is dubious."16 Student evaluations are the end result of a chaotic system where small initial perturbations can lead to widely divergent results.20 Qualitative data results highlight the critical importance of socialization in the classroom. "Field notes and video tape reveal quite apparently that the same students, in the same room, working in the same groups respond differently to different teachers." Student-faculty interactions depend greatly on the personalities involved.21 Attitude, not intelligence or mathematical competence, is the prime cause of greater achievement.23 Student-student socialization is central to the success of students. "No social barriers based on race or gender were found after careful scrutiny."21 While "success in physics knows no gender or racial boundaries ... there are intrinsic differences in students that ... enable some to succeed with little effort and others to fail even after considerable effort."20

We need to start teaching physics to children. Doing so has three benefits: 1) the layer of prejudice to penetrate is thinner, 2) social pressures have not yet taught children that physics is understandable only by geniuses, and 3):

The most important reason is that one finds radically new physical ideas springing from the flexible minds of theoretical physicists at the commencement of their professional careers -- in their twenties. This flexibility is a hallmark of a young mind and is increasingly difficult to retain as age advances. Yet the young mind must already know enough physics to appreciate what the outstanding problems are. Gaining this essential background early widens the window of having both knowledge and flexibility. Age will close the window, leaving knowledge without flexibility, all too soon.25

The key to starting early is the realization that, as a child, the physics student already mastered a very abstract language, and that his ability to acquire new languages decreases with age. In fact, there is a strong drop-off in language-acquisition ability at about ten years of age.02

At an early age, the focus should be on forming intellectual resources such as "closer means stronger" (conceptual) or "I see it" (epistemological). This formation may of necessity come prior to alignment, with early science education mostly being messing about. Science cannot end there, but Messing About is a better beginning than Remember the Magic Word.13


Storytelling is an effective method of teaching children, and one that dovetails nicely with educational theories in which children are labeled "concrete thinkers." Mr. Tompkins is a children's book written by George Gamow, a founder of Big Bang cosmology. This 1965 book was an immediate best-seller, popular with both the public and professional scientists.25

The priorities of any large physics department run: faculty research first, followed by Ph.D. students, then Masters students, upper-level undergraduate majors and courses, introductory courses for majors, introductory courses for other scientists and engineers, and finally, lowest of the low and often entirely absent, physics for non-scientists. Non-students are entirely absent. This priority list must be stood on its head. A fundamental problem of our times is the scientific illiteracy of the general population. This illiteracy threatens the foundation of our industrialized and democratic society. Physics departments should make it a top priority that 50%-75% of all non-science undergraduates take a science-literacy physics course oriented toward scientific methodology and the connections between physics and society.26

One example of this step being taken is LiPreste's appreciation course in modern physics. For a class useless for degree requirements, twenty community college students showed up out of pure interest. The texts were Isaac Asimov's "Atom" and Brian Greene's "The Elegant Universe".27 The Interactive Engagement teaching method known as HPS (Paper 17 in Part III) "exposes ideas and subjective perspectives in science which humanizes science education and makes science appealing to a wider variety of minds."12 While aimed at a specific audience, The Physics Teacher has a wide range of papers that non-physicists would find interesting. For example, the May 2001 issue has an article on "Surf Physics" with a picture of Mickey Muñoz actually hanging ten. The issue also has an article on "Marloye's Harp and the Thumb Piano", detailing 19th-century musical instruments that rely on the longitudinal vibration of solid rods. It even has a cover photo taken here in Santa Cruz. The upcoming chapter discusses the PER tools used for assessment of both students and curricula.


Chapter 06 : Assessment Tools


The assessment tool of choice for both students and curricula in the PER literature is the FCI (Force Concept Inventory). The FCI is followed in prevalence by the MB (Mechanics Baseline Test), the FMCE (Force and Motion Conceptual Evaluation), the CSEM (Conceptual Survey in Electricity and Magnetism), and the TCE (Thermal Concept Evaluation). Student interviews are used for assessment, primarily by McDermott and by authors during the initial development of the above tests. Taped classroom activities are rare but do appear in the literature, notably by Hammer. The MD (Mechanics Diagnostic), <g> (the average normalized gain), and concentration analysis will be briefly mentioned in this chapter. For more depth on these subjects, the reader is referred to Chapter 21 in Part III.

The FCI assesses the student's overall grasp of the Newtonian concept of force. The reason the FCI is not "just another physics test" is that the wrong choices are correlated to specific misconceptions. These misconceptions are important, as they must be overcome and replaced by Newtonian thinking before the student is asked to continue in his physics education. The "errors" on the FCI are more informative than the correct choices. These "errors" are common-sense misconceptions: reasonable hypotheses grounded in everyday experience. The FCI can be used (1) as a diagnostic tool, (2) for evaluating instruction, and (3) as a placement exam for advanced college courses.23

The reform movement has used FCI data as compelling evidence that there are serious problems with physics instruction. The FCI is far from the only such evidence. There is a huge PER literature on student misconceptions which supports the same conclusion. Lillian McDermott has documented, by methods other than FCI use, the huge gap between what teachers think they are teaching and what students are actually learning. … FCI questions deliberately avoid the technical, precise, unambiguous language of physics. Too often students respond to the form of the technical language rather than its meaning. For example, in a survey 80% of students could state Newton's 3rd Law even though only 15% fully understood it, as measured by the FCI. Validation interviews confirm that Newtonian thinkers are able to resolve the consequent imprecision and ambiguities arising from the avoidance of the technical language. … To the extent students have not mastered the material in the FCI, they will systematically misinterpret what they hear and read in their physics courses; they will treat the technical language of physics as muddled jargon; and they will be forced to resort to rote methods of learning and problem solving.06

The FCI motivates change in curriculum because it is the tool used to judge curricula, not merely students.29 The FCI is seen by one author as contributing to a negative classroom climate by sending "subtle cues that science is not a woman's field."01 In examining large populations, student choice among FCI distractors "contains information as valuable as the grosser distinction between correct and incorrect that has been the focus of most research."11 Hake constructs an average normalized gain, <g>, using a combination of pre-instruction and post-instruction FCI scores. He uses <g> to compare physics instruction methods and finds that "the present interactive engagement courses are, on average, more than twice as effective in building basic concepts as traditional courses."30 The MB is a universal, basic, mechanics-concept assessor of student understanding. There exists extensive data on post-instruction scores, which allows for evaluation and comparison of instructional effectiveness. "The main intent of the MB is to assess qualitative understanding; although, it looks like a conventional quantitative test." Unlike the FCI, the distractors are not "common sense alternatives", although they do include "typical student mistakes." Problems that can be solved by plugging numbers into a formula were excluded. Formal training in mechanics is required, and so the MB is most often given as a post-test.31
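For concreteness, Hake's <g> is the ratio of the actual average gain to the maximum possible average gain, with the class-average pre- and post-instruction scores expressed as percentages:

    <g> = (%post - %pre) / (100 - %pre)

So a class averaging 40% before instruction and 70% after has <g> = 30/60 = 0.50; the normalization lets courses with very different starting points be compared on one scale.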

The FMCE, CSEM, and TCE are all used less than the FCI and MB. The FMCE is a newer test intended to replace the decade-old (1992) FCI, partly in response to concerns over FCI test security.32 However, the FCI continues to predominate. The CSEM was intended to be the FCI of electromagnetism (the FCI addresses only the Newtonian concept of force). There are, however, rather substantial differences. First, the CSEM distractors are not explicitly matched to misconceptions, and second, the CSEM relies on domains (force, motion, energy) in addition to the one it is testing (electromagnetism).33 The TCE is almost the FCI of thermodynamics. It matches misconceptions to problem numbers but not to specific distractor choices (A, B, C, D, or E). Oddly, the TCE's paper doesn't specify the correct answers to the TCE questions. As a side note, the MD is an old test and is no longer used, as it was supplanted by the FCI.23 Its enduring claim for attention is that it was the last PER-published test that incorporated an explicit math component in defining physics competency.34 Concentration analysis I'll leave to Part III, Paper 08, saving only this quote:

If a multiple-choice question is designed with ... naive mental models as distractors, then the distribution of student responses yields information on the students' state. The student with a strong naive belief will pick multiple wrong answers that are based on that belief. Students who simply lack knowledge will choose distractors reflecting no unifying mental model.11
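To make the quoted idea concrete, here is a minimal Python sketch of the concentration factor of Bao and Redish as I understand it: for one question with m answer choices, N students, and n_i students choosing option i, C runs from 0 (responses spread evenly, suggesting no shared model) to 1 (every student picks the same choice). The function name and the example tallies are mine, not the paper's:

    from math import sqrt

    def concentration(counts):
        # counts: tally of student responses, one entry per answer choice
        m = len(counts)                       # number of answer choices
        N = sum(counts)                       # number of students responding
        rms = sqrt(sum(n * n for n in counts))
        return (sqrt(m) / (sqrt(m) - 1)) * (rms / N - 1 / sqrt(m))

    # 50 students, 5 choices, most picking the same distractor:
    print(round(concentration([3, 35, 5, 4, 3]), 2))   # ~0.49, a fairly concentrated response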

The best way to confirm, correct, or find out what students really think is to conduct repeated, detailed, taped, and transcribed interviews with individual students.15 In our "first look" at student activities, it is very important "to consider them one at a time and to interview them in depth, giving them substantial opportunity for 'thinking aloud' and not giving them any guidance at all." Very valuable initial studies in PER are done with small sample sizes; only later, when seeking to determine the prevalence of a misconception, are large sample sizes and written tests appropriate.35 "The greatest insight is through interviews where the students give their reasons for their choices on the [FCI]." Interviews are time consuming and not continuously necessary, as the determined misconceptions are universal.23 Interviews also provide non-physics insights, such as: it is most beneficial when inquiry sessions follow lecture,36 and cooperative grouping does have negatives for some women students ("domineering partners, fears that their partners didn't respect them, and feelings that their partners understood far more than they").19 Papers illustrate incorrect reasoning with student quotes from interviews,37 or with video excerpts of real instruction.07


Chapter 07 : Misconceptions


Misconceptions come in two basic, overlapping varieties: those that are fundamentally physics content in nature and those that are essentially language-use / word-definition in nature. In the first category, I will include the very few mathematical misconceptions spoken to in the PER literature, although mathematics is usually ignored. In the second category, there is also a minor distinction involving NES (native English speaker) and ESL (English as a second language) students, which I will touch on. Specific physics content misconceptions follow. While this list is extensive, it is by no means exhaustive; both Part III and the original works contain far more examples. A few specific papers with a large number of listed misconceptions are: Paper 01 with thirty distinct misconceptions about Newtonian force,23 Paper 17 with forty-six facets-of-knowledge about vision and optics, of which half are false beliefs,12 and Paper 07 with thirty-five misconceptions about thermal physics.38 Their lists are not replicated here.

  1. Students do not understand the concept of tension. They often confuse it with weight.39

  2. Students totally confuse conservation of mechanical energy with conservation of energy. Further, the conditions necessary for conservation of mechanical energy are not clear to most students.40

  3. Students believe that batteries are constant current sources and that current is "used up" in circuits.05 Students believe that charged objects must be conductors.47

  4. In special relativity, students associate the time of an event with the time at which an observer receives a signal from the event. Students regard the observer as dependent only on his personal sensory experiences. Students tend to treat observers at the same location as being in the same reference frame, independent of relative motion. Students tend to treat observers at rest relative to each other but at different locations as being in separate reference frames.37

  5. In electromagnetism, 1) students do not distinguish between conductors and insulators, 2) students confuse magnetic field effects with electric field effects, 3) students associate constant velocity with constant force, 4) students cannot deduce the direction of the electric field from a charge in a potential, 5) students do not see a collapsing loop as changing magnetic flux or a rotating loop as not changing that flux, and 6) students fail to believe Newton's third law extends to electric and magnetic situations.33

  6. In thermodynamics, A) there is failure to recognize that work done on and work done by a system in a given process have the same absolute value. Students argue that, depending on direction of motion, one object is "winning", i.e., doing more work than the other object. B) There is failure to recognize that work is path dependent in the general case. Far too many students (mis)generalize from their experiences with conservative forces in their mechanics courses, with several students explicitly stating that "work is independent of the path taken." Students are especially likely to neglect the path dependency of work in a cyclic process. C) There is failure to recognize that the sign of work is independent of the coordinate system. Students incorrectly speak of work as if it were a vector, and incorrectly tie back into mechanics, where the sign of work depends on the relative directions of both the force and the displacement. D) Students use the concepts of heat and internal energy interchangeably -- not just the words. E) There is confusion, even among instructors and textbook authors, about heat, temperature, and internal energy. F) Students often misinterpret the Ideal Gas Law. G) Students treat heat as a substance residing in a body. And finally, H) students fail to distinguish between state and process quantities.35

  7. In optics, students have great difficulty connecting the ray and wave models of light.43 Both introductory and advanced students lack a functional understanding of either the ray or the wave model of light ... serious difficulties include path length difference and phase difference.44 Extending from light to electrons, students are unable to interpret diffraction and interference in terms of a basic wave model. Students failed to see that the de Broglie wavelength is a function of momentum. Students misused the v = λf and λE = hc formulas (the correct relations are restated just after this list).45

  8. In mathematics, students are daunted by reasonably rapid quantitative arguments, lack facility with scientific notation, do not follow simple ratiometric reasoning, are confused by scaling arguments, are intimidated by casual use of trigonometry, and, while reasonably well schooled in calculus, are clumsy and inefficient at simple algebra.17 There is also a general faulty understanding of multi-variable mathematical relationships.48
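For reference, the relations that item 7 says students misuse are, written out (standard results quoted from common physics knowledge rather than from the cited paper's notation):

    v = λf          (wave speed equals wavelength times frequency)
    E = hf = hc/λ   (photon energy; equivalently λE = hc)
    λ = h/p         (the de Broglie wavelength as a function of momentum)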


Language issues play an important role in student misconceptions. Students tend not to realize that we physicists have redefined their sloppy English with mathematical rigor. Several language misconceptions follow, ending with a NES-versus-ESL issue.


  1. The basic everyday usage of the word "force" is as the cause of a change in position, with the greater "force" winning.49

  2. Student common-sense beliefs are often metaphorical, vague, and situation dependent. Language use facilitates vagueness by turning such words as force, energy, and power into synonyms. A disturbing observation is that five out of eight American graduate students who were given the FCI and then interviewed for a half hour each exhibited moderate to severe difficulty understanding English text. … Two interesting problems were dropped from the original version of the FCI because they were misread more often than not.23

  3. Confusion among work, heat, and internal energy is shown by a student who states: "no work is done in an isothermal process because the temperature does not change, so no heat is transferred and thus no work can be done." ... [The thermodynamic] instructional context ... is distinguished from that of mechanics by a thorough treatise on the critical prepositions to, on, and by as tied to heat and work.48

  4. Many students bring incorrect ideas with them to physics study. They also misinterpret what an instructor or a textbook is saying in a way that is reasonable but incorrect, particularly in previously unknown material. ... [An] example of a misinterpretation in the context of electrostatics is the statement "charge moves readily on a conductor but not on an insulator." Students often replace "on" with "onto" in their thinking. This is consistent with the naive idea that insulators protect you and therefore cannot be charged. Prior to instruction, 75% of the sampled population believe that a charged object must be a conductor; after instruction, 40% still do.47

  5. [A] general concept is characterized by its critical attributes. For example, quadrilateral has two critical attributes: "quad", which means four, and "lateral", which means a non-curved side. Each concept has sub-concepts which are its examples. In this case we might use "square" and "rectangle". The difficulty is that examples have additional non-critical attributes, such as the ninety-degree angles in both the square and the rectangle. Students often cannot distinguish the critical from the non-critical attributes of a concept from the given examples.40

  6. The following addresses language issues through the lens of NES versus ESL. Most PER-based successful reform strategies require significant interactions among students. Given the premium on language and interpretation of context, it is reasonable to ask how well such reforms work in a multi-cultural, multi-native language environment as found at the City College of New York. The total class results are very similar to previously published results, with the traditional class having a Hake factor <g> = 0.23 and the reform class a <g> = 0.43. Within the traditional class, NES students had <g> = 0.26 and ESL students <g> = 0.21. Within the reform class, NES <g> = 0.46 and ESL <g> = 0.42. These results suggest native language is not a significant factor. Please note that all students spoke English in class.16 Also please note my comments in Part III, Paper 43.


The more obvious misconceptions are in Part III. Paper 01 includes "heavier objects fall faster". Paper 17 includes "vision does not require the delivery of light or anything else from an object into the observer's eye". And Paper 07 includes "heat and temperature are the same thing". Next up is a chapter on I.E. methods.


Chapter 08 : Interactive Engagement Methods


I.E. methods involve change in the curricula, change in the role of teachers, and change in the actions of students. Hake's paper, with its introduction of <g> and its use of 6000 students, was instrumental in shaping the Interactive Engagement versus Traditional methods debate. In that paper, Hake defines Interactive Engagement methods as "methods … designed at least in part to promote conceptual understanding through interactive engagement of students in heads-on (always) and hands-on (usually) activities which yield immediate feedback through discussion with peers and / or instructors." Hake defines Traditional methods as those which primarily rely "on passive-student lectures, recipe labs, and algorithmic-problem exams."30 Change can be positively motivated. At Colgate University, the motives for reform were: 1) to improve student understanding of the basic concepts of physics, 2) to exploit modern technology, 3) to bring modern physics into the introductory syllabus, and 4) to increase student engagement.17 Change can be negatively motivated. Hestenes believes change is required because the traditional methods are failures:

... there is no evidence that students who attend lectures learn more than those who don't. In fact, the complex cognitive skills required to understand physics cannot be developed by listening to lecturers any more than one can learn to play tennis by watching tennis matches. The tapes of Feynman's Lectures on Physics are the pinnacle of classroom performance. They were expressly prepared for first-year physics students at Caltech. Feynman himself regarded them as a failure, with only a small fraction of the students really able to cope with the course.06


Change can even be motivated by Educational Theory, and Workplace Studies:

Design is an upper-level Bloom's Taxonomy educational objective, and one of the most frequent activities of physicists in the workplace. In response to this, in some labs at Ohio State University, students design their own experiments to determine some property of a system.02


Why to change, is tied back into the earlier parts of this paper. What programs of change others have implemented follow. Chapters 22 through 26 in Part III provide significantly more detail; the following is intended as an overview.

At the curricular level, more modern physics (atomic, quantum, and relativity) needs to be taught. Modern physics is significant and interesting. Quantum mechanics and relativity provide the interconnection of the various physics domains.17 An early familiarity with modern physics attracts young people to physics. The prospect of studying modern physics is the most influential reason that students choose to study physics at the university level.25 Electronics needs to be taught. Physicists are called upon to modify or combine electronic lab equipment from overlapping generations, to work in electromagnetically noisy environments, and to compensate for minimal or nonexistent technical support.50 B.S. graduates with electronics knowledge are valued by business; paid internships are available to physics students with analog electronics knowledge, and offering such courses can be part of a broad local effort to attract small high-tech companies to the area.51 To find room in the curricula for modern physics and electronics, items must be cut. Items must also be cut to compensate for the more time-intensive requirements of Interactive Engagement methods. Fifteen percent of the old traditional curriculum at Harvard University is not covered.52 Twenty-five percent of the traditional content was cut at Dickinson College, although electronics and nonlinear dynamics were added.19 In his high school class, Elby cut momentum, oscillatory motion, electronic circuits, magnetism, and all of modern physics.14 Colgate University's reform effort cut the other way: "To add new material, something must be cut and in a traditional course that something is Newtonian Mechanics." The reformers give several interesting reasons, dealt with in more detail in Part III, Paper 29.17 There are review packages designed to supplement both traditional and reform instruction. MAOF is one such review package; its uniqueness is melding mechanics and electromagnetism via concepts common to both domains, such as conservative forces proportional to 1/r² (Newton's gravitational law and Coulomb's force).40 A far more widespread set of review packages is the "Tutorials in Introductory Physics" put out by McDermott at the University of Washington.

The Tutorials are a guided inquiry experience. A tutorial instructor guides students through the necessary reasoning by posing questions. The students work in groups of three or four. The Tutorials consist of a pretest, worksheet, homework, and post-test. They are not designed to transmit information or build standard problem solving. Rather, they construct concepts, develop reasoning skills, and relate the formalism of physics to the real world, thus developing a functional understanding that supplements textbooks and lectures.05


The divisions within I.E. methods -- curricula, teacher, and student -- are not sharply delineated. What follow are teaching packages and named reform attempts that mix all three thoroughly.

  • PI, Peer Instruction, is Mazur’s idea on how to make lectures an Interactive Engagement method. PI is a structured questioning process that involves every student in the class. PI divides a class into a series of short presentations, each followed by a related conceptual question (ConceptTest). Students are given one or two minutes to formulate individual answers and report them to the instructor. If the percentage of correct answers is between 30% and 70%, which it normally is, the students discuss their answers and the reasoning underlying them with each other. After this two-to-four-minute discussion, the instructor polls students for their answers, explains the answer, and moves to the next topic. If the percentage is less than 30%, additional instructor-centered teaching is needed; if the percentage is greater than 70%, moving straight to the next subject is the most efficient use of time (a sketch of this decision logic follows this list). ConceptTest questions are part of midterms and finals.52


  • APF, Audience Paced Feedback, provides each student in the lecture theater with an electronic handset which allows each and all to answer simple binary questions from the lecturer. There are four question formats possible: (a) exploration, (b) verification, (c) interrogation, and (d) organization. This method is fundamentally different from the “raise your hands to answer” method because (1) all students are answering, (2) the lecturer can ask multiple-choice questions, (3) the student replies are anonymous, (4) the lecture format shifts closer to that of a seminar, and (5) a permanent record is possible. APF allows the lecturer to ensure that the majority of the student body has understood the material before moving on. APF gives the students an active role in the lecture. During many questions, students are given time to discuss the problem, thereby bringing an element of student-student teaching into the lecture. And finally, there is a small Hawthorne effect due to the unusually positive environment.53


  • ILD, Interactive Lecture Demonstrations, has lecture students interactively engaged with a lecturer performing demonstrations that require the use of MBL (Microcomputer Based Laboratory) equipment. Efforts to improve physics education while maintaining existing structures have resulted in ILDs. An ILD is an eight-step procedure that engages students actively via individual predictions, small-scale discussion with nearest neighbors, and, after the MBL-measured demonstration, the completion of a results sheet. The ILD’s unique feature is the real-time data provided by the MBL tools.54


  • JiTT, Just in Time Teaching, is used to enforce Harvard University’s requirement that students read their assigned material prior to class.52 JiTT consists of short web-based assignments which are due a few hours before class. The instructor builds an interactive lecture session around the students’ answers. Thus students take part in a guided discussion that begins with their own preliminary understanding of the material. To be truly effective, computer-assisted instruction must create a feedback loop between instructor and student; JiTT is one method of achieving this feedback.55


  • Physlets are multi-media-focused problems which differ fundamentally from traditional physics problems. In a Physlet, the text does not give numbers. Observing an animation on the computer screen, the student must watch the motion, apply physics concepts, and measure the parameters that the student deems important; only then can the problem -- find the minimum speed, say -- be solved. This requires the student to consider the problem qualitatively and prevents the “plug-and-chug” method of problem solving.55


  • Jeopardy Problems are problems where the student works backwards from a given mathematical equation to a diagrammatic, graphical, pictorial, and/or word description of a physical process (a worked example follows this list). There are also Diagrammatic and Graphical Jeopardy Problems, where students invent a word or picture description and a math description consistent with a given diagram or graph. Jeopardy Problems allow us to help students develop qualitative understanding while at the same time helping them learn to use the symbolic language of physics with understanding. Jeopardy Problems prevent the means-end analysis many novice students use to do end-of-chapter problems. They promote multiple representations, where equations, diagrams, and graphs all become the same short story about life. Jeopardy Problems highlight units, which are the key to determining whether we’re dealing with pressures, densities, accelerations, or distances. They are also easy to create. Because of their probable novelty, students will need practice before these problems are used on tests.56


  • During supervised small group practice, use of a single whiteboard per group [24” x 18”] allows for a common, easily seen representation of the group’s work, facilitating inter-group communication. As it is both easily seen and a common record, the whiteboard facilitates instructor feedback. Its semi-permanence accommodates variations in students' attention spans. Small group problem solving is another approach that can be used to promote high-quality practice among students. Timely feedback, including the correctness of solutions, is critically important for student learning. In supervised small group practice, a given student can swiftly get feedback from the teacher and/or his several group members. Further, a student gains educationally by giving feedback. Finally, a teacher can communicate with several people who share a common problem at the same time.22


  • SDI, Socratic Dialogue Inducing, labs are simple Newtonian experiments designed to produce conflict between a student’s common sense understanding and Newton’s Laws. This conflict induces collaborative discussion among lab partners and/or a Socratic dialogue with an instructor. SDI labs are relatively effective in guiding students to construct a coherent conceptual understanding of Newtonian mechanics. This is due to (a) the interactive engagement of students, (b) the Socratic method of instruction, (c) a degree of individualized instruction, (d) multiple representations of modeling physical systems, (e) kinesthetic sensations that intensify cognitive conflict, (f) cooperative grouping and (g) repeated exposure to the coherent Newtonian explanation in many different contexts.57 The Socratic method of instruction is also used in McDermott’s Tutorials where the instructor does not lecture but poses questions that help the student arrive at their own answers.05


  • HPS, history and philosophy of science, has three great strengths in teaching physics. First, it shows students a realistic picture of the transformation of knowledge from old beliefs to new. Second, HPS exposes competitive ideas and subjective perspectives, which humanizes science and appeals to a wider variety of minds. Third, the historical conceptual difficulties overcome by scientists in the past are similar to those faced by the learners of today. The arguments employed by the great minds of the past can be used to help students learn physics.12 The Colgate reformers used the original published experiments of Boyle, Coulomb, Faraday, Millikan, Ulrey, Thomson, and Bohr.17 Güémez et al. used Black’s original experiments to advance their pedagogical arguments.58
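
Because the branching in Peer Instruction turns entirely on the fraction of correct first answers, a minimal sketch of that decision logic may help. This is my own restatement of the 30%/70% rule from the PI entry above; the function name and return strings are hypothetical.

    def next_pi_step(num_correct, num_polled):
        """Decide what follows a ConceptTest poll, using the 30%/70% rule."""
        percent = 100.0 * num_correct / num_polled
        if percent < 30:
            return "revisit the concept with additional instructor-centered teaching"
        if percent <= 70:
            return "2-4 minutes of peer discussion, then re-poll and explain"
        return "explain the answer briefly and move to the next topic"

    # Example: 18 of 40 students answer the ConceptTest correctly (45%),
    # so the class proceeds to peer discussion.
    print(next_pi_step(18, 40))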


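As a worked illustration of the Jeopardy format described above (my own example, not one taken from Heuvelen and Maloney), a student might be handed only the equation

\[ (60\,\mathrm{kg})(9.8\,\mathrm{m/s^2}) - N = (60\,\mathrm{kg})(2.0\,\mathrm{m/s^2}) \]

and asked for the story. One consistent short story: a 60 kg person stands on a scale in an elevator accelerating downward at 2.0 m/s², so the scale reads N = (60 kg)(9.8 − 2.0 m/s²) = 468 N. The units identify both sides as forces; the student must supply the picture, the free-body diagram, and the words.
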
There are many more Interactive Engagement issues that need to be addressed; most, though not all, are in the upcoming potpourri chapter.


Chapter 09 : A Potpourri of Interactive Engagement Points


There are quite a few points within I.E. which should be highlighted. The usefulness of microscopic models to explain macroscopic phenomena is debated. Thacker et al. argue that models of microscopic processes should be introduced as an integral part of any E&M course.59 Loverude et al. point out that many students in thermodynamics confirm their incorrect macroscopic arguments with reference to an incorrect microscopic model.35

Cooperative learning is widely used in I.E. methods. Cooperative learning promotes teamwork, one of the highest priorities the real world demands of education. Further, Heuvelen notes that Johnson found achievement one grade point higher in cooperative learning classes than in traditional classes.02 Properly structured group work can be a particular blessing for young women, although the wrong structure can easily negate these benefits.01 The enforcement of roles (recorder, critic, etc.) is subject to much debate, with some instructors ignoring this aspect of cooperative learning.22 Other instructors explicitly teach cooperative team roles and protocols, reinforcing them via grading schemes.21 Cooperative groups and computers are two attempts to find cost-effective methods of providing timely guidance and feedback to students.08

There is debate about the effectiveness of computers / technology to enhance learning. There are those who argue: "running a computer simulation is very different than doing a physical experiment,"60 "technology by itself cannot improve instruction,"23 "to have a long-lasting impact on science education, [computer use] needs to be based on a successful pedagogy and not on the latest compilers, hardware, or algorithms."55 Worse, "computer users seemed to rely on a more authoritarian view, that the right answer was something from the computer rather than something they had constructed themselves,"60 and "... computer use sometimes avoids error; [computer use] does not always confront [error] thus the error may persist in non-computer environments."39 The other side is: "computer related tasks brought everyone back [to work], in this variant of peer instruction, students became involved as they had to tell the technology to do something for them."21 Computers are "particularly useful in answering 'what if' questions."02 In this day of powerful, inexpensive PCs and versatile software packages we should no longer restrict ourselves to only problems with closed form solutions, but should "perform Numerical Integration on both closed form (for comparison) and non-closed form (for realism) problems."61 "Today's software is platform-independent being based on virtual machines, meta languages and open Internet Standards, and thus not subject to obsolescence on [the historical eighteen month cycle]."55 Finally, computerized grading of homework is as good as conscientious, time-consuming hand grading by a graduate student.60
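
To make the numerical-integration point concrete, here is a minimal sketch (my own, not code from reference 61) that integrates one-dimensional free fall with quadratic air drag -- a problem without an elementary closed form -- and then checks the same integrator against the drag-free closed form y = gt²/2; the drag coefficient b is an arbitrary illustrative value.

    # Euler-Cromer integration of dv/dt = g - (b/m) * v**2 (fall with quadratic drag).
    g, b, m = 9.8, 0.25, 1.0          # SI units; b is a hypothetical drag coefficient
    dt, t, v, y = 0.001, 0.0, 0.0, 0.0
    while t < 3.0:
        v += (g - (b / m) * v**2) * dt
        y += v * dt
        t += dt
    print("with drag: fell %.2f m in 3 s" % y)

    # Check the integrator on the closed-form (drag-free) case: y = g*t**2/2.
    t, v, y = 0.0, 0.0, 0.0
    while t < 3.0:
        v += g * dt
        y += v * dt
        t += dt
    print("no drag: %.2f m (closed form gives %.2f m)" % (y, 0.5 * g * 3.0**2))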

Homework is viewed somewhat askance in the reform movement. Hestenes: "Practice makes permanent, and mindless plug-and-chug without concepts is counterproductive, not perfect."06 Reif: "Individual tutoring by the instructor is much more effective than homework, which too often merely perpetuates bad habits (haphazard use of formulas being one)."08 Steinberg et al.: "Homework often fails to build a conceptual framework. It is often reduced to finding a formula with the right combination of symbols or finding a worked-out similar problem."16 Elby has an interesting page detailing his homework philosophy, which hinges on grading effort, not correctness, and on handing out a partial solution set with the assignment. Elby argues that traditional students view homework as grade-getting rather than learning. His unique homework methodology, combined with frequent in-class mini-quizzes, was explicitly designed to push students toward the realization that thinking through problems is the best way to learn physics, versus copying each other or a book.14

I.E. methods for the most part still use textbooks, homework, and labs. However, due to the time-intensive nature of in-class activities, students are required to do more outside of class. For instance, Beichner et al. assigned students the responsibility of reading their textbooks outside of class in order to free up time for in-class Socratic dialogues.21 Johnson required students to attempt a subset of homework problems prior to each Supervised Practice meeting.22 Laws et al. require incomplete or incorrect labs to be completed or corrected during non-class (evening or weekend) open-lab hours.19

Not only are I.E. methods time intensive, they also often depend on low student-to-teacher ratios. Workshop labs have an instructor plus two undergraduate TAs per twenty-four students.19 Supervised Practice has two instructors, drawn from graduate students, upper-class undergraduate majors, and volunteer post-docs, per twenty-five students.22 SDI Labs require two Socratic dialogists, one of whom has had previous experience, for every twenty-four students.57

Many I.E. methods provide a room for students to immerse themselves in physics. The IMPEC group takes the prize with twenty-four hour access to their classroom.21 Indiana University has a "physics forum" open five to eight hours each day.30 Carnegie Mellon University allowed access to a drop-in center several hours each day.22

I.E. methods have many goals. Some are physics oriented, some are teaching oriented, and some are social. Among the many goals are: (1) to reduce the fragmentation of physics into its constituent domains, (2) to make physics instruction more concrete, and (3) to increase minority participation in physics. Students have a hard time organizing their knowledge around central characteristics. They often fail to distinguish between general concepts and their examples. This failure is exacerbated by courses which divide physics into separate fields, i.e., mechanics, E&M, optics, etc. An inter-domain organization of knowledge has several advantages. Among others, it allows a student to solve unfamiliar problems in one area of physics with tools from another.40 One tool that is used in mechanics, electrical circuits, and quantum field theory is harmonic oscillation; its multiple uses well justify the initial time spent on a mass and a spring.03 Inter-domain organization has its own pitfalls; for example, "work is treated differently in mechanics and thermodynamics and this inconsistency may make it more difficult for students to transform a concept initially learned in one context to the other."35 Inter-domain organization is practical. Oregon State University's new junior-level classes are Static Vector Fields, Oscillations, One Dimensional Waves, Quantum Measurement and Spin, Central Forces, Energy and Entropy, Periodic Systems, Rigid Bodies, and Reference Frames.13

I.E. methods seek to lessen the abstraction of physics. This in part compensates for modern life,50 in which less and less physical knowledge is required for everyday activities. For example, radio operators of fifty years ago had to know more electronics than radio operators of today, although the same is not true of radio designers. There are magnitudes more operators than designers, Illiterates versus Elites04 again. So in I.E. classes, students pitch baseballs, break pine boards with their bare hands, and ignite paper by compressing air.19 In tutorials, students use water tanks, dowel rods, and sponges as a concrete method of understanding the effects of a wave passing through a narrow slit.44 In working problems, vectors are color-coded by type (displacement, velocity, acceleration, and force).64 The abstraction of physics is also reduced by various verbal strategies, such as classroom debate among student groups10 and old-fashioned recitations.17 I should point out that the classroom debate is followed by a whole-class vote, which combats compartmentalization -- the ability of the human mind to hold two contradictory beliefs.10

I.E. methods make a conscious effort to get and keep minority participation in physics. Retention of minorities is one of the three major goals of the IMPEC program, and that program achieved excellent results, passing 63% of female and 100% of minority students.21 The reform class at the City College of New York had a lower drop-out rate than did the traditional class.16 At Dickinson College, over 40% of the calculus-based Workshop Physics students are women.19 At Grinnell College, 50% of the physics majors are women.65 These stand in contrast to 19% -- the percentage of all physics bachelor's degrees going to women.01 Women do have one advantage: they benefit from inquiry-based laboratory exercises while men do not.36 Next up we'll address Teachers and TAs -- do they matter?


Chapter 10 : Teachers and TAs


PER is a bit schizoid when it comes to Teachers and TAs. Underlying this schizophrenia are two ideas. First, constructivism alters the role of the teacher into that of a facilitator. Second, TAs are still students, students whose abilities both in physics and in teaching are suspect. A body of PER argues against the importance of the individual teacher as opposed to the curricula. Hestenes et al., in two separate papers, argue that "basic knowledge gain under conventional instruction is essentially independent of the professor."66 McDermott states that "student knowledge is practically instructor independent in equivalent physics courses." However, PER spends a lot of energy informing teachers on how to be effective, implying that poor teaching can be detrimental even if curricula choice enforces an upper bound on the effectiveness of good teaching. Hestenes, in a third paper, argues that good teaching is an acquirable skill: "Technical knowledge about teaching and learning is as essential as subject content knowledge," in part because "teachers with low FCI scores are unable to raise student scores above their own."06 While student evaluations and attitude are not measures of student learning, if one measures student opinion instead of student ability, teachers do matter. The IMPEC group found that “results depend strongly on who is teaching and how much previous experience this person has had with the course... it takes about three years for a professor to achieve highly satisfied students."21 "Students respond positively when many instructors use many different techniques in a fairly short time period," according to Holbrow et al.17 Aware of the large gap between small-scale studies and practical education delivery, Reif has written a textbook-workbook combination. Reif recommends distinguishing between wrong and nonsensical answers, providing a more severe penalty for the latter. He also recommends giving students frequent diagnostic tests.08 Hammer notes that it is important for teachers to have multiple perspectives on student knowledge.

Multiple perspectives are important because they increase the conceptual resources of the teacher. What instructors perceive depends critically on their conceptual resources and strongly influences how they think to intervene in student learning. What instructors notice informs their decisions as to whether to slow down the presentation, which topics to cover or problems to assign, and how to advise particular students. Conceptual resources include the instructor's knowledge of physics. The five based in PER are: 1) misconceptions, 2) p-prims, 3) reasoning abilities, 4) epistemological beliefs, and 5) inquiry practices. As noted here and detailed at length at the end of this paper, there is a difference between what is available in an instructor's head (conceptual resources) and publicly articulated perspectives. Learning to teach should involve developing the skills of gathering information, including the skills of moderating discussions and interviewing students regarding their understanding. Further, forums need to be created for conversation among currently isolated instructors.07


Ehrlich speaks against "lowering the bar" for physics majors but for lower standards in "conceptual physics courses which promote greater scientific literacy among the general population." He argues that failing a class can both teach a great deal and be a long-term good, even if a short-term pain. Ehrlich also believes you're doing a good job teaching if you:

  • and your students respect each other

  • start each semester with enthusiasm

  • try and evaluate new teaching methods

  • keep up with advances in physics

  • encourage deep understanding

  • maintain high grading standards

  • take student evaluations seriously -- but not too seriously20



Teachers operate under the false assumptions that students differentiate between critical and non-critical attributes in an example or that students identify strong resemblances between examples.59 One reason few students develop a functional understanding of physics is that professors view students as younger versions of themselves, when in reality professors were atypical students.05 Professors' instructional intuitions are as inadequate and incorrect as most students' physics intuitions, and for the same reason -- extensive, unstudied, personal experience.07 Lecturing and teaching are such powerful learning experiences for the teachers that we may not want to give them up, even if other methods are better for our students.03 Worse, a well-trained teacher is not enough. An individual teacher does not have the means or time to transform the teaching techniques of others. Long-term district support, money, and follow-up are requirements for science reform success.67 One problem at the department level is the research-oriented hiring and tenure focus. New faculty are never selected primarily for their teaching skills and have to invest their time and energy in research performance just to keep their jobs.26 Your department is doing a good job if it:

  • maintains high academic standards

  • encourages and welcomes all students

  • adds new options in the major as needed

  • tries different forms of instruction

  • offers students research opportunities

  • listens to its students

  • graduates some majors20


PER does not focus on teachers; still, there are a few research comments. Fourteen out of eighteen Arizona high school teachers scored better than 80% on the FCI.23 Twenty teachers benefited from going through the MAOF program as "learners."40 All [six] Colgate instructors attend each lecture.17 Finally, in a rare example of instructor-instructor learning, the importance of providing ample wait time after your questions is highlighted.21

We will now shift our focus to the Teaching Assistants (TAs). Many PER papers note that part of their methodology is the explicit, formal training of TAs. An exception to this notes: "There is currently no explicit training of teaching assistants. As a result, there is great variation in their effectiveness."29 Some programs use undergraduates as teaching assistants. Hake reports that the four I.E. courses with more than 200 students all employed undergraduates to augment the instructional staff.30 Johnson notes that his program results in a slight increase in cost, mainly to pay undergraduate TAs.22 The explicit TA training is often weekly and can include faculty and technical personnel.48 Occasionally, TAs were required to attend lectures.52 TA training is important because graduate students are not experts in course material, pedagogy, or the management of a collaborative learning environment. Training addresses instructor content knowledge, student difficulties with physics concepts, student difficulties with problem-solving processes, and student difficulties with peer interactions.22 McDermott prepares tutorial instructors in weekly seminars which are conducted "on the same material and in the same manner that the tutorial instructors are expected to teach." McDermott's tutorials are judged successful if the students' post-test matches or exceeds the tutorial instructors' pre-test;05 in one case success was 15%.44 Graduate students, independent of TA status, are used as guinea pigs in PER papers. In one study only seven out of twenty-three students,37 in another only two out of sixteen,23 were able to successfully answer questions that the researchers viewed as basic.


Chapter 11 : Validation, Literature Problems, and other issues


Validation practices vary extremely. One study uses factor analysis and the KR20 reliability test.33 Another boldly states that formal procedures to establish validity and reliability "are unnecessary" because of similarity to previously validated work.23 Some authors use the FCI to "norm" their studies.10 Some use the FCI despite its non-applicability (see Part III Paper 24).60 Validation is done by interviews.15 Validity is addressed by observing the results of Newtonian Thinkers (seven out of eight questions right) "on other tests, and on their written explanations to ... question 9."46 Validation occasionally acknowledges the prevalent use of convenience sampling; one study mitigates this by comparing students on GPA, gender, and major.36 Another study raises the validation issues of: 1) the varying amount of time courses spend on the tested subject; 2) teaching to the test; 3) test-question leakage due to the open-source nature of the FCI, MB, etc.; 4) the Hawthorne effect; and 5) the John Henry effect.30
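
For the reader's reference, the KR20 statistic cited above has a standard form: for a test of K dichotomously scored items, with p_i the proportion of students answering item i correctly, q_i = 1 - p_i, and sigma² the variance of the total scores,

\[ r_{\mathrm{KR20}} = \frac{K}{K-1}\left(1 - \frac{\sum_{i=1}^{K} p_i q_i}{\sigma^2}\right). \]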

The more complex problems with PER literature are elaborated in Part III. Here I merely seek to raise a few red flags. There are both subtle and blatant linguistic shifts. Hake asserts that "interactive engagement courses are, on average, more than twice as effective in building basic concepts as traditional courses." Yet his building block, the FCI, does not test basic concepts (plural); it only assesses a single concept, that of the "Newtonian concept of force."42 A small matter, until entire curricula are modified to enhance the Hake factor29 instead of modifying only the section on Newtonian force. Redish et al. obscure the fact that the MPEX matches your students' responses to the answers that nineteen college and university teachers, all first-time implementers of Workshop Physics, would "prefer their students to give." This obscuring is done by multiple re-definitions. The nineteen college and university teachers implementing Workshop Physics in their classrooms after attending a workshop at Dickinson College become Group 5. Group 5 was asked to respond with the answer they would prefer their students to give, which becomes the "preferred response of Group 5." The preferred response becomes the "favorable response." The favorable response becomes the "expert response." So nineteen first-time workshop implementers become "experts" in epistemology, and the "response" is idealized to the point of being neither that of the actual teachers nor that of any real students.15 The conclusions an author reaches do not always match those he has stated earlier in his paper, due to linguistic slipperiness [Part III, Paper 28].59 Sometimes the language is not slippery, just thick, as for example a study that “tries, by means of elicited structural components of students' knowledge, to infer the influence of a historically oriented instruction in optics on the content conceptual knowledge of students in this science domain.”12
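
For reference, the "Hake factor" mentioned above is the average normalized gain of Paper 02 in Part III: with class-average pretest and posttest percentages,

\[ \langle g \rangle = \frac{\%\langle \mathrm{post} \rangle - \%\langle \mathrm{pre} \rangle}{100 - \%\langle \mathrm{pre} \rangle}, \]

the fraction of the possible improvement that the class actually realized.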

Many studies judge results on "correct reasoning"; a few require "correct answers"; some use both.37 One study shifts between the two depending on which is most favorable.44 Reif notes a case where 45% of his students gave correct answers although 70% reasoned correctly; he attributes the difference to minor algebra mistakes.08 In other cases, notably Yes/No/Explain problems, the reverse occurs, with more correct answers than correct reasons.

There are authors who advocate a position despite there being "no overall improvement in gains."23 There are authors who speculate despite explicitly stating that their comments / beliefs are "impossible to verify" [See Part III Paper 21].10 There are studies based on one qualitative question.36 There are studies which provide questions and misconceptions but no match of distractor to specific survey item choice.38 There is at least one test published with questions which are known to the author to be defective and for which, in one case, no satisfactory replacement has been devised in over a decade.06

I.E. methods are not always implemented ideally. Sometimes, the issue is as basic as irregularities in light bulbs and inadequately charged batteries.36 Other times, the issues are more complex.

In the case of implementing ILDs, less time was spent on analogies to similar situations and on having the students discuss results than the ILD teacher notes suggest. Furthermore, students were allowed too much time to make predictions, resulting in fast students losing interest and focus. The small studio classroom with non-sloping floors made it difficult for all students to see the demos. And finally, the students were so collaboratively oriented that it was difficult to get individual initial predictions, resulting in some students never making a personal intellectual commitment. In implementing CGPS, there were deviations from the five-step problem-solving strategy taught. Some of these deviations were the result of Studio Physics structures common to all classes, Experimental and Standard; for example, no context-rich problems were included on exams. Only half the class periods were given over to CGPS, with the other half focusing on the standard problems which would be tested. Three additional issues were: 1) due to time constraints, the instructor did not model CGPS techniques as often as desired; 2) students were strongly resistant to cooperative group roles, and this aspect soon died out; and 3) the CGPS problem-solving strategy is typically irrelevant to textbook-style homework, so students resented having to use a procedure when it was easier to solve the problem without it.29


Knowledge retention/extension is not commonly addressed; when it is, the results are usually poor. The PEG at the University of Washington has had to design and implement whole series of tutorials precisely because students who have completed one tutorial still did not recognize its relevance to “new” situations.44 Such series development terminates only when students extend existing knowledge into new areas.45 The only other positive retention report that I've found is Thornton and Sokoloff's finding that after six weeks of no additional dynamics instruction, there was an approximately 6% increase in students responding in a Newtonian way.46 Otherwise, things are a bit bleak. Thacker et al. note in the body of their paper an “important and discouraging finding”: students, after explicit instruction, do not use microscopic mechanisms when confronted by “completely unfamiliar phenomena.”59 Beichner et al. note that “no significant difference between previous IMPEC students and their traditionally taught peers was found” in standard exam performance in a following traditionally taught course.21 Finally, Marshal and Dorward inform us that “no cascading effect was noted. The only difference between inquiry and non-inquiry groups were those dealt with directly in the inquiry exercises.”36

The size of PER studies varies widely in both time frame and number of participants. Most studies last a semester or less. There are, however, exceptions. Peer Instruction has a ten-year study,52 Colgate University reformers provide their experiences and lessons over an eight-year period,17 and the MAOF teaching package was developed over a four-year period.40 McDermott has been working on various PER issues for over twenty-five years.41 The number of participants tends to range from one class worth up to around fifteen classes worth, realizing that not all class sizes are equal. There is wide fluctuation. Hestenes has an FCI database of “more than 20,000 students and 300 physics classes.”06 The FCI was originally given to 1500 high school students and 500 university students.23 Hake made his average normalized gain, <g>, public with a 6542-student, 62-introductory-class study.30 Poulis et al. introduce Audience Paced Feedback (APF) in a 2600-student survey.53 Maloney et al. bring us the Conceptual Survey of Electricity and Magnetism (CSEM) with a 1500-student sample.33 McDermott weighs in with 1000 introductory physics students.05 Bao and Redish bring us Concentration Analysis with 778 students in fourteen classes. Loverude et al. perform a thermodynamics study with 500 students.35 Interactive Lecture Demonstrations (ILDs) are evaluated at the University of Oregon using 200 students.54 An E&M study used both ninety students at the University of Ohio and twenty-six students at the University of Michigan-Flint. Hammer’s Ph.D. involved the study of six students over the course of a semester.15 Some papers only provide us with the number of classes, three in the case of Steinberg.60 Some studies start small, with seven graduate students and twenty advanced undergraduates, and end big, with five years, fourteen instructors, 800 students, and “various universities.”37 Although most studies consist of calculus-based introductory physics students, by no means do all. Some studies even involve only high school students.12 These studies are reported in multiple forums; thirty-nine papers in fourteen different publications are noted in the back of PER Am. J. Phys. Suppl. 68(7).

The Studio Physics paper (#15 in Part III) is important because it is one of very few acknowledged and examined failures in PER. Its authors conclude that “it is necessary to mentally engage students; small classes, cooperative groups and computer availability are good but sadly, insufficient. Equally important are research based questions and activities.”29 I think this is enough for now except for a brief conclusion. Part III deals with additional issues such as: the hidden curriculum,24 knowledge structure,08 McDermott,05 and the whole pro-math,17 anti-math09 issue that usually shows up as a qualitative bias in the literature, whose authors believe it compensates for a quantitative bias in Traditional instruction. Finally, we reach the conclusion.


Chapter 12 : Conclusion to the Paper


In conclusion, I hope this has whetted your appetite for more. Part III has significantly more information, after which I would recommend the American Journal of Physics PER Supplements that came out in July of 1999, 2000, and 2001, because of their singular focus. A long-haul subscription to The Physics Teacher is also a good idea. Teaching is a profession and, as a professional, society expects you to keep up with advances in your field. It is, within limits, like being a doctor or a lawyer; the new matters. As a professional, I hope you benefit from the work of your fellow professionals, contribute your own insight and knowledge to the public record, and enjoy yourself.


References


1. L. McCullough, “Women in Physics: A Review,” The Physics Teacher Vol. 40, 86-114 (2002)



2. A. Heuvelen, “Millikan Lecture 1999: The Workplace, Student Minds, and Physics Learning Systems,” Am. J. Phys. 69(11), 1139-1146 (2001)



3. E. Redish, “The Implications of Cognitive Studies for Teaching Physics,” Am. J. Phys. 62(6), 796-803 (1994)



4. D. Goodstein, “Now Boarding the Flight from Physics, David Goodstein’s Acceptance Speech for the 1999 Oersted Medal presented by the American Association of Physics Teachers, 11 January 1999,” Am. J. Phys. 67(3), 183-186 (1999)



5. L. McDermott, “Oersted Medal Lecture 2001: Physics Education Research – The Key to Student Learning,” Am. J. Phys. 69(11), 1127-1137 (2001)



6. D. Hestenes, “Who needs physics education research !?,” Am. J. Phys. 66(6), 465-467 (1998)



7. D. Hammer, “More than misconceptions: Multiple perspectives on student knowledge and reasoning, and an appropriate role for education research,” Am. J. Phys. 64(10), 1316-1325 (1996)



8. F. Reif, “Millikan Lecture 1994: Understanding and teaching important scientific thought processes,” Am. J. Phys. 63(1), 17-32 (1995)



9. P. Lindenfeld, “Format and content in introductory physics,” Am. J. Phys. 70(1), 12-13 (2002)



10. C. Kalman, S. Morris, C. Cottin, R. Gordon, “Promoting conceptual change using collaborative groups in quantitative gateway courses,” PER Am. J. Phys. Suppl. 67(7), S45-S51 (1999)



11. L. Bao and E. Redish, “Concentration analysis: A quantitative assessment of student states,” PER Am. J. Phys. Suppl. 69(7), S45-S53 (2001)



12. I. Galili and A. Hazan, “The Influence of an historically oriented course on students’ content knowledge in optics evaluated by means of facets-schemes analysis,” PER Am. J. Phys. Suppl. 68(7), S3-S15 (2000)



13. D. Hammer, “Student resources for Learning,” PER Am. J. Phys. Suppl. 68(7), S52-S59 (2000)



14. A. Elby, “Helping physics students learn how to learn,” PER Am. J. Phys. Suppl. 69(7), S54-S64(2001)



15. E. Redish, J. Saul, and R. Steinberg, “Student expectations in introductory physics,” Am. J. Phys. 66(3), 212-224 (1998)



16. R. Steinberg, and K. Donnelly, “PER-Based Reform at a Multicultural Institution,” The Physics Teacher Vol. 40, 108-114 (2002)



17. C. Holbrow, J. Amato, E. Galvez, and J. Lloyd, “Modernizing introductory physics,” Am. J. Phys. 63, 1078-1090 (1995)



18. A. Elby, “Another reason that physics students learn by rote,” PER Am. J. Phys. Suppl. 67(7), S52-S57 (1999)



19. P. Laws, P. Rosborough, and F. Poodry, “Women's responses to an activity-based introductory physics program,” PER Am. J. Phys. Suppl. 67(7), S32-S37 (1999)



20. R. Ehrlich, “How do we know if we are doing a good job in physics teaching?,” Am. J. Phys. 70(1), 24-29 (2002)



21. R. Beichner, L. Bernold, E. Burniston, P. Dail, R. Felder, J. Gastineau, M. Gjertsen, and J. Risley, “Case study of the physics component of an integrated curriculum,” PER Am. J. Phys. Suppl. 67(7), S16-S24 (1999)



22. M. Johnson, “Facilitating high quality student practice in introductory physics,” PER Am. J. Phys. Suppl. 69(7), S2-S11 (2001)


23. D. Hestenes, M. Wells, and G. Swackhamer, “Force Concept Inventory,” The Physics Teacher Vol. 30, 141-157 (1992)



24. (a) E. Redish, J. Saul and R. Steinberg, “Student expectations in introductory physics,” Am. J. Phys. 66(3), 212-224 (1998); (b) D. Hammer, “More than misconceptions: Multiple perspectives on student knowledge and reasoning, and an appropriate role for education research,” Am. J. Phys. 64(10), 1316-1325 (1996)



25. R. Stannard, “Communicating physics through story,” Physics Education, 30-34 (2001)



26. A. Hobson, “Science Literacy and Departmental Priorities,” Am. J. Phys. 67(3), 177 (1999)



27. M. LiPreste, “A Comment on Teaching Modern Physics,” The Physics Teacher Vol. 39, 262 (2001)



28. The Physics Teacher Vol. 39 (2001)



29. K. Cummings, J. Marx, R. Thornton, D. Kuhl, “Evaluating innovations in studio physics,” PER Am. J. Phys. Suppl. 67(7), S38-S44 (1999)



30. R. R. Hake, “Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses,” Am. J. Phys. 66(1), 64-74 (1998)


31. D. Hestenes and M. Wells, “A Mechanics Baseline Test”, The Physics Teacher Vol. 30, 159-166 (1992)


32. (a) R. Thornton and D. Sokoloff, “Assessing student learning of Newton’s laws: The Force and Motion Conceptual Evaluation and The Evaluation of Active Learning Laboratory and Lecture Curricula,” Am. J. Phys. 66(4), 338-352 (1998); (b) R. Hake, “Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses,” Am. J. Phys. 66(1), 64-74 (1998)



33. D. Maloney, T. O’Kuma, C. Hieggelke, A. Heuvelen, “Surveying students’ conceptual knowledge of electricity and magnetism,” PER Am. J. Phys. Suppl. 69(7), S12-S23 (2001)



34. I. Halloun and D. Hestenes, “The initial knowledge state of college physics students,” Am. J. Phys. 53(11), 1043-1055 (1985)



35. M. Loverude, C. Kautz, and P. Heron, “Student understanding of the first law of thermodynamics: Relating work to the adiabatic compression of an ideal gas,” Am. J. Phys. 70(2), 137-148 (2002)



36. J. Marshal and J. Dorward, “Inquiring experiences as a lecture supplement for preservice elementary teachers & general education students,” PER Am. J. Phys. Suppl. 68(7), S27-S37 (2000)



37. R. Scherr, P. Shaffer, and S. Vokos, “Student understanding of time in special relativity: Simultaneity and reference frames,” PER Am. J. Phys. Suppl. 69(7), S24-S35 (2001)



38. S. Yeo and M. Zadnik, “Introductory Thermal Concept Evaluation: Assessing Students’ Understanding,” The Physics Teacher Vol. 39, 496-504 (2001)



39. L. McDermott, "Millikan Lecture 1990: What we teach and what is learned -- Closing the gap," Am. J. Phys. 59(4), 301-315 (1991)



40. E. Bagno, B. Eylon, and U. Gamiel, “From fragmented knowledge to a knowledge structure: Linking the domains of mechanics and electromagnetism,” PER Am. J. Phys. Suppl. 68(7), S16-S26 (2000)



41. L. Kirkpatrick, “American Association of Physics Teachers 2001 Oersted Medalist: Lillian C. McDermott,” Am. J. Phys. 69(11), 1126 (2001)



42. (a) D. Hestenes, M. Wells, and G. Swackhamer, “Force Concept Inventory,” The Physics Teacher Vol. 30, 141-157 (1992); (b) R. R. Hake, “Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses”, Am. J. Phys. 66(1), 64-74 (1998)



43. P. Colin and L. Viennot, “Using two models in optics: students’ difficulties and suggestions for teaching,” PER Am. J. Phys. Suppl. 69(7), S36-S44 (2001)



44. K. Wosilait, P. Heron, P. Shaffer, L. McDermott, “Addressing student difficulties in applying a wave model to the interference and diffraction of light,” PER Am. J. Phys. Suppl. 67(7), S5-S15 (1999)



45. S. Vokos, P. Shaffer, B. Ambrose, L. McDermott, “Student understanding of the wave nature of matter: Diffraction and interference of particles,” PER Am. J. Phys. Suppl. 68(7), S42-S51 (2000)



46. R. Thornton and D. Sokoloff, “Assessing student learning of Newton’s laws: The Force and Motion Conceptual Evaluation and The Evaluation of Active Learning Laboratory and Lecture Curricula,” Am. J. Phys. 66(4), 338-352 (1998)



47. R. Harrington, “Discovering the reasoning behind the words: An example from electrostatics,” PER Am. J. Phys. Suppl. 67(7), S58-S59 (1999)



48. (a) P. Lindenfeld, “Format and content in introductory physics,” Am. J. Phys. 70(1), 12-13 (2002); (b) M. Johnson, “Facilitating high quality student practice in introductory physics,” PER Am. J. Phys. Suppl. 69(7), S2-S11 (2001)



49. (a) D. Styer, “The Word ‘Force’,” Am. J. Phys. 69(6), 631-632 (2001); (b) E. Redish, “The Implications of Cognitive Studies for Teaching Physics,” Am. J. Phys. 62(6), 796-803 (1994)



50. D. Henry, “Resource Letter: TE-1: Teaching electronics,” Am. J. Phys. 70(1), 14-23 (2002)



51. T. Usher and P. Dixon, “Physics goes practical,” Am. J. Phys. 70(1), 30-36 (2002)



52. C. Crouch and E. Mazur, “Peer Instruction: Ten years of experience and results,” Am. J. Phys. 69(9), 970-977 (2001)



53. J. Poulis, C. Massen, E. Rubens, and M. Gilbert, “Physics lecturing with audience paced feedback,” Am. J. Phys. 66(5), 439-441 (1998)



54. D. Sokoloff and R. Thornton, “Using Interactive Lecture Demonstrations to Create an Active Learning Environment,” The Physics Teacher Vol. 35, 340-347 (1997)



55. W. Christian, “Educational Software and the Sisyphus Effect,” Computing in Science and Engineering May-June 1999, 13-15 (1999)



56. A. Heuvelen and D. Maloney, “Playing Physics Jeopardy,” Am. J. Phys. 67(3), 252-256 (1999)



57. R. Hake, “Socratic Pedagogy in the Introductory Physics Laboratory,” The Physics Teacher Vol. 30, 546-552 (1992)


58. G. Güémez, C. Fiolhais and M. Fiolhais, "Revisiting Black's Experiments on the Latent Heat of Water," The Physics Teacher Vol. 40, 26-31 (2002)



59. B. Thacker, U. Ganiel and D. Boys, “Macroscopic phenomena and microscopic processes: Student understanding of transients in direct current electric circuits,” PER Am. J. Phys. Suppl. 67(7), S25-S31 (1999)



60. R. Steinberg, “Computers in teaching science: To simulate or not to simulate?,” PER Am. J. Phys. Suppl. 68(7), S37-S41 (2000)



61. P. Assimakopoulos, “A Computer-Aided Introductory Course in Electricity and Magnetism,” Computing in Science and Engineering Nov/Dec 2000, 88-94 (2000)



62. S. Bonham, R. Beichner, and D. Deardorff, “Online Homework: Does it Make a Difference?,” The Physics Teacher 39, 293-296 (2001)



63. The American Physical Society, The Forum on Education, Spring / Summer 2000


64. (a) R. Hake, “Socratic Pedagogy in the Introductory Physics Laboratory,” The Physics Teacher Vol. 30, 546-552 (1992); (b) R. Hake, “Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses,” Am. J. Phys. 66(1), 64-74 (1998)



65. M. Schneider, “Encouragement of Women Physics Majors at Grinnell College: A Case Study,” The Physics Teacher 39, 280-282 (2001)



66. (a) D. Hestenes, M. Wells and G. Swackhamer, “Force Concept Inventory,” The Physics Teacher Vol. 30, 141-157 (1992); (b) I. Halloun and D. Hestenes, “The initial knowledge state of college physics students,” Am. J. Phys. 53(11), 1043-1055 (1985)



67. J. Bower, “Scientists and Science Education Reform: Myths, Methods, and Madness,” http://www.nas.edu/rige/backg2a.htm 10 pages


Part II : The Data Analysis



Chapter 13 : Introduction to the Data Analysis


In Part II, the Data Analysis, I seek to determine the initial knowledge state of the students in five community college classes. I also seek a few common unifying mental models composed of a limited and patterned set of misconceptions. Finally, I make some comments on the inefficiencies of the FCI as a tool for seeking these proposed common unifying non-Newtonian mental models.

The five Cabrillo Community College physics classes were all assessed in the spring semester of 2002. Cabrillo Community College (CCC) Physics 2B (Group JM) and CCC Physics 10 (Group FK) were given the MPEX, and their data is presented in Chapter 15. Data is presented in two formats: one is in parallelism with the original paper, and the other is a new format. CCC Physics 4C (Group CF) was given the MB, and its data is presented in Chapter 16. This MB data is also used in a concentration analysis following Bao and Redish. CCC Physics 4A (Group JZ) was given the FCI, and its data is presented in Chapter 17. While using this data in an attempt to find the sought mental models, I also perform some test analysis of the FCI. CCC Physics 11 (Group PG) was given the FCI, and its data is presented in Chapter 18. Chapter 14 is my attempt at obtaining self-consistency among FCI Tables I, II, and V of Hestenes et al.

I sought the existence of a few common unifying mental models composed of a limited and patterned set of misconceptions. The literature in Part III raises the possibility that such models exist. For example, Bao and Redish in Paper 08 note three accepted PER facts. First, there are a small number of research-identified common mental models. Second, multiple-choice tests can be designed with these mental models as distractors -- one such test is the FCI. Third, a student with a strong naïve belief will pick multiple wrong answers that are based on this unifying mental model; the simply ignorant will choose distractors randomly. Leaving aside the muddying of mental model and misconception definitions, if a few unifying mental models composed of a limited and patterned set of misconceptions exist, then the teacher's efficiency is notably increased, as the target of his energy becomes both small and well defined.
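
Bao and Redish quantify exactly this patterned-versus-random distinction with their concentration factor, used in Chapter 16. For a question with m choices answered by N students, n_i of whom select choice i,

\[ C = \frac{\sqrt{m}}{\sqrt{m}-1}\left(\frac{\sqrt{\sum_{i=1}^{m} n_i^{2}}}{N} - \frac{1}{\sqrt{m}}\right), \]

which runs from 0 (responses spread evenly over the choices, suggesting random guessing) to 1 (every student selecting the same choice, suggesting a shared model).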

Improvements in the FCI would help teachers in their attempts to correlate several different student misconceptions into a student's non-Newtonian mental model. MCSR items are valuable resources in a twenty-nine-question test, and certain null distractors could be reassigned to other misconceptions to enhance misconception-to-misconception correlation.

Data is presented in a new format that enhances analysis of individual student beliefs. The tables I actually used differ from those in this part: they were hand-generated on graph paper. Symmetric, uniform cell sizes enhance the presented format's utility, as does being able to put the entire graph on one sheet of paper.



Chapter 14 : FCI Table Modifications and Data



D. Hestenes, M. Wells and G. Swackhamer, "Force Concept Inventory," The Physics Teacher Vol. 30, 141-157 (1992)

In this chapter, I present the Cabrillo Community College data in the manner used by the original FCI paper. I reprint AVH data from that paper for comparison purposes, but do not reprint data from the remaining five groups. The focus of this paper is the initial knowledge state of the student. This paper does not seek to compare different instructional strategies, nor does it seek to develop a teacher-competence ranking. For this last reason, Table IV is omitted from this paper. For the above reasons, post-instructional data is unnecessary, and could even have been counterproductive to the development of this paper. Tables III & V reflect the lack of post-instructional data.

A reader who compares the structures of Tables I, II, and V between this paper and the original FCI paper will note some modifications. The minor change is the use of italics to note multiple appearances of a single inventory item in Tables I and II, rather than the original paper's incomplete use of parentheses. The major change is the mutual consistency between Tables I and II on the one hand, and Table V on the other. I inserted eleven inventory items into, and removed two from, Table II. I inserted nine Table I and two Table II inventory items into Table V. I also removed one inventory item from Table V. My primary goal in these alterations was to attain consistency among all three tables. My secondary goal was to enhance the possibility of detecting the hoped-for internal connections in a student's alternative belief structure. The Table II changes are: 16A inserted at line I1; 6D inserted at I2; 8E and 10C at I3; 27B and 27D at I4; 4C and 10D at I5; 15D at Ob; 1D at G3; and 16C at G5. In Table II, I also removed 12B from line AF1 and the question mark from R1. The Table V changes are: R1 inserted in grid locations 29A and 29B; [0] inserted in grid locations 23D, 24E, and 25B; [2] inserted in 7E; [4] inserted in 9D, 18B, and 28C; [5F] inserted in 22D; and [5G] inserted in 18B. In Table V, I also removed [5S] from grid location 22D.

Unfortunately, formatting issues compel me to present the explanation for Table V here instead of on the table itself. The far left column indicates the question number of the Force Concept Inventory diagnostic test. Columns A, B, C, D, and E are the multiple-choice alternatives (items) for each question, using the codes in Tables I & II. The correct (Newtonian) answer, expressed in Table I code, is enclosed in square brackets. The common sense alternative choices are classified by Table II code. In each grid, the percent frequency distributions of the students' answer (item) choices are given for the pretest (upper row) and the posttest (lower row). The groups from left to right are ASU PHY 105, Cabrillo Community College Phys 11, and Cabrillo Community College Phys 4A. All numbers are percentages; they may not add up to 100% per question per group because some students did not answer all twenty-nine questions. Post-instruction testing was not performed for either Cabrillo Community College group.
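
A minimal sketch of how each grid entry is computed (my own reconstruction; names are hypothetical): given one group's raw answer sheets, tally the percentage choosing each item per question.

    # answers[s][q] holds student s's choice ("A"-"E") on question q, or None if skipped.
    def choice_percentages(answers, num_questions=29):
        n = len(answers)                        # number of students in the group
        table = []
        for q in range(num_questions):
            counts = {c: 0 for c in "ABCDE"}
            for sheet in answers:
                if sheet[q] is not None:        # omitted answers are why rows can total < 100%
                    counts[sheet[q]] += 1
            table.append({c: round(100.0 * k / n) for c, k in counts.items()})
        return table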


Table I: FCI Modified, Newtonian Concepts

Inventory Item

0. Kinematics

Velocity discriminated from position 20E

Acceleration discriminated from velocity 21D

Constant acceleration entails parabolic orbit 23D, 24E

Constant acceleration entails changing speed 25B

Vector addition of velocities 7E

1. First Law

With no force 4B, 6B, 10B

With no force and velocity direction constant 26B

With no force and speed constant 8A, 27A

With canceling forces 18B, 28C

2. Second Law

Impulsive force 6B, 7E

Constant force implies constant acceleration 24E, 25B

3. Third Law

For Impulsive forces 2E, 11E

For continuous forces 13A, 14A

4. Superposition Principle

Vector sum 19B

Canceling forces 9D, 18B, 28C

5. Kinds of Force

5S. Solid contact

Passive 9D, 12B, 12D

Impulsive 15C

Friction opposes motion 29C

5F. Fluid contact

Air resistance 22D

Buoyant (air pressure) 12D

5G. Gravitation

Gravitation 5D, 9D, 12B, 12D, 17C, 18B, 22D

Acceleration independent of weight 1C, 3A

Parabolic trajectory 16B, 23


Table II: FCI Modified, A Taxonomy of Misconceptions Probed by the Inventory. Presence of the Misconception is suggested by Selection of the corresponding Inventory Item.

Inventory Item

Kinematics

K1. position-velocity undiscriminated 20B, 20C, 20D

K2. velocity-acceleration undiscriminated 20A, 21B, 21C

K3. nonvectorial velocity composition 7C

Impetus

I1. impetus supplied by "hit" 9B, 9C, 16A, 22B, 22C, 22E, 29D

I2. loss/recovery of original impetus 4D, 6C, 6D, 6E, 24A, 26A, 26D, 26E

I3. impetus dissipation 5A, 5B, 5C, 8C, 8E, 10C, 16C, 16D, 23E, 27C, 27E, 29B

I4. gradual/delayed impetus building 6D, 8B, 8D, 24D, 27B, 27D, 29E

I5. circular impetus 4A, 4C, 4D, 10A, 10D

Active Force

AF1. only active agents exert force 11B, 13D, 14D, 15A, 15B, 18D, 22A

AF2. motion implies active force 29A

AF3. no motion implies no force 12E

AF4. velocity proportional to applied force 25A, 28A

AF5. acceleration implies increasing force 17B

AF6. force causes acceleration to terminal velocity 17A, 25D

AF7. active force wears out 25C, 25E

Action/Reaction Pairs

AR1. greater mass implies greater force 2A, 2D, 11D, 13B, 14B

AR2. most active agents produce greatest force 11D, 13C, 14C

Concatenation of Influences

CI1. largest force determines motion 18A, 18E, 19A

CI2. force compromise determines motion 4C, 10D, 16A, 19C, 19D, 23C, 24C

CI3. last force to act determines motion 6A, 7B, 24B, 26C

Other Influences on Motion

Cf. centrifugal force 4C, 4D, 4E, 10C, 10B, 10E

Ob. obstacles exert no force 2C, 9A, 9B, 12A, 13E, 14E, 15D

Resistance

R1. mass makes things stop 23A, 23B, 29A, 29B

R2. motion when force overcomes resistance 28B, 28D

R3. resistance opposes force/impetus 28E

Gravity

G1. air pressure-assisted gravity 9A, 12C, 17E, 18E

G2. gravity intrinsic to mass 5E, 9E, 17D

G3. heavier objects fall faster 1A, 1D, 3B, 3D

G4. gravity increases as objects fall 5B, 17B

G5. gravity acts after impetus wears down 5B, 16C, 16D, 23E





Table III: FCI Inventory Scores



Group    Mean Inventory Pretest    Number of Students

AVH      34% (14%)                 116

PG       35% (14%)                  33

JZ       59% (22%)                  43


Percentages in parentheses are standard deviations, assuming a normal distribution which only roughly approximates the data.







Table V




Chapter 15 : MPEX Data on CCC Physics 2B and CCC Physics 10



15.1 : MPEX, Template and Expansion


E. Redish, J. Saul, and R. Steinberg, "Student expectations in introductory physics," Am. J. Phys. 66(3), 212-224 (1998)


In this chapter, I present Cabrillo Community College data both in the manner used in the original MPEX paper and in a new format that preserves question and student responses. I reprint Expert and TYC data from Redish et al. for comparison purposes, but do not reprint data from the remaining three calibration groups or the five additional institutions. The original MPEX paper is interested in the effect of instruction on student attitudes and so needs post-instructional data. This paper is interested in the initial knowledge state of students, and thus has no such need. I followed the original paper's procedure by collapsing the five-response Likert system into a two-response system for analysis, although the analysis possible from ET1 and ET2 would be enhanced by retaining the actual Likert numbers, as this would allow analysis of belief strength. Also, my previous education class in rubric design and use strongly advised an even-numbered system (four or six choices instead of five) to prevent fence-sitting.
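
A minimal sketch of the collapse (my own reconstruction; I assume responses 1-2 and 4-5 are the two agreement poles, with 3 neutral, and that an ET cell is left blank when the student matches the expert):

    # Collapse a 1-5 Likert response against the expert's preferred pole ("A" or "D").
    def collapse(likert, expert):
        if likert is None:
            return "blackened"                  # actual non-response
        if likert == 3:
            return "neutral"                    # fence-sitting: counted neither way
        student = "A" if likert >= 4 else "D"
        return "" if student == expert else student   # blank square = favorable

    print(collapse(5, "D"))   # student agrees where experts disagree -> unfavorable "A"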

Table IV and Fig. 2(b) are matched to the corresponding presentations in Redish et al. ET1 is a hierarchical table of student responses for the fifteen students of Group JM. The least favorably responded-to question is at the top; the most favorably responded-to question is at the bottom. The student who most agrees with expert opinion is on the left; the student who least agrees with expert opinion is on the right. Blank squares are responses by the student which are in agreement with expert opinion; actual non-responses are blackened-in squares. The letters A and D represent disagreement with expert opinion on that question. ET2 follows the same pattern as ET1 for the thirty-six students of Group FK.


Table IV: MPEX

Percentages of students giving favorable/unfavorable responses, overall and by cluster, on the MPEX survey at the beginning (pre) and end (post) of the first unit of university physics.


          Overall  Independence  Coherence  Concepts  Reality Link  Math Link  Effort

Expert    87/06    93/03         85/12      89/06     93/03         92/04      85/04


TYC pre   55/22    41/29         50/21      30/42     69/16         58/17      80/08

TYC post  49/26    42/32         48/29      35/41     58/17         58/18      65/21


JM pre    67/33    50/50         75/25      69/31     88/12         53/47      61/39

JM post


FK pre    52/34    40/52         44/37      39/40     76/16         46/32      70/17

FK post



Expert is defined in the original MPEX paper. TYC is a community college reported in the original paper. JM is Cabrillo Community College (CCC) Phys. 2B, and FK is CCC Phys. 10.














Table ET1













Table ET2


15.2 : Commentary on JM Data in Parallelism with Redish et al. Using ET1


The Independence cluster in Redish et al. specifically notes survey items #1 (SI#1) and #14 (SI#14). The expert group disagreed with SI#1 at 100% and with SI#14 at 84%. The JM group disagreed with SI#1 at only 36% and with SI#14 at 36%. As ET1 shows, SI#1 and SI#14 were the two survey items on which the expert and JM groups were in least agreement. ET1 contains a wealth of information; some examples follow. Student #10 (S10) is in 100% disagreement with experts on the Independence cluster, as seen by highlighting SI#1, SI#14, SI#13, SI#27, SI#8, and SI#17. S6, S13, and S2 are in disagreement with the expert opinion 67% of the time, with an additional six students disagreeing 50% of the time. There is general correlation between Overall and Independence cluster student scores, with S4 disagreeing only once and S10 disagreeing all six times. SI#17 is the only easy item in this cluster, with the other five survey items all in the upper third of ET1. In individual percentage terms, S7 and S1 had forty percent of their disagreement with experts in this cluster (2 out of 5 each).

The Coherence cluster in Redish et al. specifically notes survey items SI#21 and SI#29. The expert group disagreed with SI#21 at 85%; group JM disagreed at 93%. SI#29 is course-specific; it depends on the availability of formula sheets or books for exams. In mirror image to the Independence cluster, four of the five survey items (SI#16, SI#29, SI#15, and SI#21) are in the lower half of ET1, showing good correlation between the expert and JM groups. Only SI#12 is in the upper third, showing strong disagreement. There is less correlation between the Overall and Coherence cluster scores than was the case for the Independence cluster, with S15 very much in disagreement with expert opinion (4 out of 5 not matching expert opinion) whereas S10 is in disagreement only 1 out of 5 times. Only two students (S15 and S14) disagree with expert opinion more than 50% of the time. Other than S15, S9 is the only student with more than 20 percent of his individual disagreement with experts falling in this cluster (2 out of 8).

The Concepts cluster in Redish et al. specifically notes SI#4 and SI#19. TYC is highlighted as having a pre-instruction agreement with expert opinion of 16% on SI#4; group JM is at 80%. SI#19 is distinguished from SI#4 only by the words "most crucial". The Concepts survey items (SI#19, SI#27, SI#4, SI#32, and SI#26) are spread out. SI#19 is quite difficult at 06/09, and SI#26 is one of the easiest at 14/01. It is rather remarkable that while no student agreed with expert opinion completely, eleven out of fifteen students agreed at the 80% level. Three students (S2, S6, and S10) strongly disagreed with expert opinion. Twenty-five percent of S6's disagreement with expert opinion occurred in this cluster (4 out of 16).

The Reality Link cluster in Redish et al. is agreed upon at the 93% level by experts. Group JM comes close, agreeing at the 88% level; half of all disagreement with expert opinion came on SI#22. Even so, all survey items are in the lower half of ET1, with two of the four Reality Link items among the easiest at 14/01. It is interesting to note that what disagreement exists is dispersed amongst the students.

The Math Link cluster in Redish et al. makes note of SI#2. Group UMN had a 20/48 response in percentage form; group JM had a 33/67 response to SI#2. Within one place, all survey items of the Math Link cluster are in the upper half of ET1, with both SI#2 and SI#6 among the most difficult at 05/10. Four students (S3, S13, S14, and S10) agree with experts only 20% of the time. S4 and S7 agree with experts only 50% and 60% of the time, respectively. Math Link accounts for fully 50% of all disagreements S4 has with expert opinion; for S7 it is 40% of the same. One student, S1, is in absolute agreement with expert opinion, and three others (S9, S5, and S12) are in excellent (80%) agreement.

The Effort cluster in Redish et al. focuses primarily on the severe lowering of favorable response due to instruction. It does mention that experts are in agreement at the 85% level. Group JM is in agreement at only the 61% level pre-instruction. The five survey items are quite spread out in student agreement, with SI#6 at only 05/10 and SI#3 at a strong 13/02. Six students (S3, S2, S13, S6, S14, and S10) agree with expert opinion only at the 50% level. Two students (S7 and S9) are in full agreement with expert opinion.

This method of arranging a table, highlighting certain rows of information (survey items for a specific cluster), and looking for correlations along a student's column helps find outliers which can otherwise be overlooked. S10 has complete (6/6) disagreement with expert opinion on the Independence cluster. S15 has a high (4/5) disagreement with expert opinion on the Coherence cluster. S6 has an equally high (4/5) disagreement on the Concepts cluster. These are examples of uniqueness which would be worth an instructor's time to address individually. Conversely, S1 is in full agreement with expert opinion in the Math Link cluster, and thus she is a candidate for being a peer tutor to someone like S3 who, while in severe disagreement (4/5) with expert opinion in the Math Link cluster, is in general agreement (23/11) overall.


15.3 : Commentary on FK Data in Parallelism with Redish et al. Using ET2


The Independence cluster in Redish et al. specifically raises SI#1 and SI#14. The expert group disagreed with SI#1 at 100% and with SI#14 at 84%. The FK group disagreed with SI#1 at only 8% and with SI#14 at 17%. As ET2 shows, SI#1 was the item with the least agreement between the expert group and the FK group; SI#14 was fifth out of thirty-four in non-congruence between the groups. ET2 contains a wealth of information; some examples follow. S2 and S30 are in active disagreement with expert opinion, while S10, though not in agreement, holds opinions that are vague. Seven out of thirty-six students fail to agree with expert opinion 83% of the time. Of these seven, S3 and S16 have one third of their total disagreement with expert opinion in this cluster. In fact, only ten of thirty-six students agree with expert opinion more than 50% of the time in this cluster, with S12 and S22 the only two in complete agreement. This is particularly interesting in S22's case, as that student was only in agreement with expert opinion nineteen out of a possible thirty-four times. In personal percentage terms, S24 and S17 were the most impacted, with approximately 40% of their total disagreement with expert opinion in this cluster. SI#17 is the only item in this cluster in the lower (agreeable) half of ET2. All other survey items (SI#1, SI#14, SI#27, SI#8, and SI#13) are in the upper (disagreeable) half. ET2's black squares are neutral or non-responses by survey takers. It is interesting to note that disagreeable SI#1 had no black squares, whereas agreeable SI#17 had two! The most black squares were six, for SI#27.

The Coherence cluster in Redish et al. specifically notes SI#21 and SI#29. The expert group disagreed with SI#21 at 85%; group FK disagreed at only 47%. SI#29 is course-specific; it depends on the availability of formula sheets or books for exams. Survey items are fairly spread out, ranging from SI#12 with six agreements to SI#15 with twenty-seven agreements out of a possible thirty-six. There were significant black-box impacts on SI#12, at eleven, and SI#16, at ten. No student completely agreed with expert opinion in this cluster, although S36, S25, S14, S15, S23, S9, and S10 all failed to agree only because of neutral or no decisions being made (black boxes). S4 is in complete and active disagreement with expert opinion in this cluster. S3, S10, and S6 fail to agree with the experts on any choice, primarily by passive black-boxing. In personal percentage terms, S12 had 33% of her personal disagreement with expert opinion in this cluster, and both S11 and S8 had approximately 25% of the same.

The Concepts cluster of Redish et al. specifically notes SI#4 and SI#19. TYC is highlighted as having a pre-instruction percentage of 16 in agreement with expert opinion on SI#4; group FK is at 11%. SI#19 is distinguished from SI#4 only by the words "most crucial". Given the similarity of these two items, it is interesting to note that no student agreed with expert opinion on both. No student was in complete concordance with expert opinion, nor was any student in complete active disagreement. S34 chose not to respond to any of the five items in this cluster, which is not unique for this student, as he chose not to respond twenty-two times. Similarly, S22, S9, S10, and S6 chose to ignore the majority of items in this cluster. Student 22 may be worth interviewing, as she was in overall agreement with the expert opinion (19 agreements out of 34 possible) and black-boxed eleven responses, four of which were in this cluster (36%). S31 and S5 stand out by disagreeing with expert opinion 80% of the time. This is particularly interesting in S5's case, as all of her active disagreements with expert opinion are in this cluster. Eleven out of thirty-six students agree with expert opinion more than 50% of the time, with only two of these in agreement at the 80% mark. SI#32 and SI#26 were noticeably more likely to have student-expert concordance (25 out of 36), and SI#4 and SI#19 were much more likely to have student-expert discordance (~5 out of 36).

The Reality Link cluster in Redish et al. is agreed on at the 93% level by experts. Group FK agrees at the 76% level. The four survey items for the Reality Link (SI#22, SI#18, SI#10, and SI#25) are all in the most agreeable (lowest) third of all survey items. Almost all active disagreement is by the right-most third of students (17 out of 23), as is almost all black-boxing (11 of 12). The only student who stands out from this pattern is S3, who agrees with the experts only 50% of the time in this cluster yet is in the left-hand half of ET2 with a 56% overall agreement with experts.

The Math Link cluster in Redish et al. specifically notes SI#2. Group UMN had a 20/48 response in percentage form; group FK had a 25/50 response to SI#2. Within two places, all the Math Link items are in the upper (disagreeable) half of ET2. The three students (S24, S11, and S17) who agree most often with expert opinion totally agree with expert opinion in this cluster. While there is no absolute active disagreement, some students (S34, S30, S6, and S21) fail to agree at all with expert opinion. S31, S1, S20, S15, and S19 agree with expert opinion only once in this cluster. In all, only fourteen out of thirty-six students agree with experts more than 50% of the time. Group FK was a conceptual physics class, so the comments in Redish et al. regarding high school physics courses in the Math Link cluster are relevant. An expert (H.S. teacher) level of 67% is more realistic than the expert (Group 5) level of 92% for comparison purposes.

The Effort cluster in the original paper focuses primarily on the severe lowering of favorable responses due to instruction. It does mention that expert agreement is at the 85% level. Group FK is in agreement at the 70% level pre-instruction. The five survey items in the Effort cluster are all in the lower (agreeable) two thirds of all possible survey items. It is interesting to note that both S14 and S16 agree with expert opinion nineteen times but have very different individual responses in this cluster: S14 agrees with expert opinion only once, whereas S16 agrees with expert opinion completely. It may be worth the instructor's time to get these two students involved in a discussion on effort expectations in a physics course. Further, S33 is a useful peer tutor on this cluster because of her oddity: she is in less than 50% overall agreement with expert opinion and yet in complete 100% agreement with expert opinion in this cluster. It would be worth the instructor's consideration to use her as a peer tutor on this issue, particularly with her near neighbors in overall agreement, S20 and S2. Combating the negative 18% in Redish et al. requires effective, focused effort, and similar co-workers may have insight on this subject unavailable to the instructor.

As mentioned in the opening paragraph of this chapter, more information on the strength of students' views, rather than their mere existence, could be obtained by retaining the actual Likert numbers rather than converting to a quasi-binary agree-disagree format (with black boxes for the rest). The allowance of neutrality [Likert response 3] is a flaw in the methodology. My educational classes for my secondary school math credential emphasized the need for rubrics to be even-numbered, normally four, occasionally six, precisely to prevent the purely neutral response. This is to make people commit; a six-point rubric (strongly agree, agree, slightly agree, slightly disagree, disagree, strongly disagree) would probably work best for the MPEX. The previous information was representative of what the MPEX can provide for a physics instructor in regards to the initial knowledge state of the student. Beyond the student, classes and specific categories of classes have initial knowledge composites. While it behooves an instructor to be aware of these composites, the MPEX can provide some comparison and contrast between, in this case, the second class in a calculus-based introductory sequence (group JM) and a stand-alone conceptual physics class (group FK).


15.4 : Comparison and Contrast Between Groups JM and FK


Beyond the individual stands the initial knowledge composite of a class. At the real risk of discriminating, an instructor would still do well to prepare for the reality of the class he is to face. Only the roughest of comparisons is possible from Table IV in part 1 of this chapter. The following are some example comparisons between group JM, a calculus-based second course in a four-course sequence, and group FK, a non-mathematical stand-alone course.

Independence survey items track each other, with SI#1, SI#14, SI#27, SI#8, and SI#17 in order from least in agreement with expert opinion to most. The exception is SI#13. Experts disagreed with the statement "My grade in this course is primarily determined by how familiar I am with the material. Insight or creativity has little to do with it." Only seven out of fifteen students in group JM disagreed with the statement and thus agreed with expert opinion, a 47% rate. Eighteen out of thirty-six students from group FK disagreed with this statement and thus agreed with expert opinion at the 50% rate. So what has happened is that SI#13 stayed in place in percentage terms while all five other survey items shifted in percentage around it. SI#1, at the top (least in agreement with expert opinion) for both groups JM and FK, shifted in percentage terms from 20% for JM to 08% for FK; SI#14 from 20% to 17%; SI#27 from 53% to 36%; SI#8 from 60% to 42%; SI#17 from 87% to 86%. So even though in percentage terms SI#17 remained constant between groups, its relative position did not: it has four items below it, and seven sharing its ranking, in group JM, whereas in group FK it has one item below it and none sharing its ranking. Thus, while the table formats used in ET1 and ET2 are very useful for showing data within a class, additional information such as percentage lines might help make inter-class comparisons more reliable. So SI#27 and SI#8 are the main differentiators between these two groups, with both percentage and relative changes. More non-math students (group FK) than calculus students (group JM) believe, in opposition to expert opinion, that "understanding physics basically means being able to recall something you've read or been shown." More non-math students (group FK) than calculus students (group JM) believe, in opposition to expert opinion, that "In this course, I do not expect to understand equations in an intuitive sense; they just have to be taken as givens."

Coherence survey items do not track well, as the only constant relative position is SI#12, which is at the top for both groups. SI#12, however, has a large percentage shift, from 40% agreement with expert opinion for group JM (calculus) to 17% for group FK (non-math). Thus almost all of a non-math class (group FK) believes, in disagreement with expert opinion, that "knowledge in physics consists of many pieces of information, each of which applies primarily to a specific situation," whereas a substantial portion of a calculus class does agree with expert opinion that the above quote is false. The shifts for the other survey items, going from JM to FK by SI number, are: SI#16 from 73% to 53%, SI#29 from 80% to 31%, SI#15 from 87% to 75%, and SI#21 from 93% to 47%. The largest percentage changes between classes in this cluster are SI#29 and SI#21. FK students are two to three times as likely as JM students to believe that it will be "a significant problem in this course [to be] able to memorize all the information I need to know." Worse, the same ratio applies to their relative beliefs that "If I came up with two different approaches to a problem and they gave different answers, I would not worry about it; I would just choose the answer that seemed most reasonable!" This last is a statement accepted, in defiance of expert opinion, by more than half of the non-math class, whereas only one person out of fifteen in the calculus class held this view.

Concepts survey items do hold their relative position, with the exception of survey item number four. SI#4 went from the middle for group JM to the top for group FK; its percentage shifted from 80% to 11%. Experts and group JM strongly disagree with the following statement; group FK just as strongly agrees: "Problem solving in physics basically means matching problems with facts or equations and then substituting values to get a number." Percentage shifts from JM to FK for the remaining items are: SI#19 from 40% to 14%, SI#27 from 53% to 36%, SI#32 from 80% to 69%, and SI#26 from 93% to 69%. Other than SI#4, the most significant percentage shifts were in SI#19 and SI#26. Experts disagree with the upcoming statement, as do 40% of calculus-based students (group JM) and 14% of non-math students (group FK): "The most crucial thing in solving a physics problem is finding the right equation to use." Experts agree with the upcoming statement, as do 93% of group JM and 69% of group FK: "When I solve most exam or homework problems, I explicitly think about the concepts that underlie the problem."

Reality Link survey items do not hold their relative positions, but are so close to one another that it does not matter. The percentage spread from top to bottom for group JM is 13%, for group FK 12%. Group JM averages 12% above group FK. The only survey item worth individual attention is SI#18, which has both the biggest relative shift, last to second, and the largest percentage shift, at 21%. Experts and group JM agree with the following statement, as do about three-fourths of group FK: "To understand physics, I sometimes think about my personal experiences and relate them to the topic being analyzed." Apparently, one in four non-math students does not either 1) relate, 2) think, or 3) have relevant personal experiences. I suspect the third. Today in class, roughly twenty students had no idea what the basic purpose of a pulley is, did not know how it actually works, and had never used one before.

Math Link survey items hold their relative position, excepting SI#6. In relative position, SI#6 goes from second most in disagreement with expert opinion for group JM to most in agreement with expert opinion for group FK. In percentage shift, it goes from 33% for JM to 61% for FK: a reversal of the normal trend in which JM agrees more with expert opinion than FK does. This is particularly noteworthy as JM is the mathematically sophisticated group and FK is not. Experts agree, as do most of group FK, with the following statement; group JM, however, disagrees strongly: "I spend a lot of time figuring out and understanding at least some of the derivations or proofs given either in class or in the text." As several hypothetical reasons for this outcome come to mind, some individual student interviews from the majority opinion in group JM would be illuminating to me were I the instructor. The remaining survey item percentage shifts from JM to FK are: SI#2 from 33% to 25%, SI#8 from 60% to 42%, SI#20 from 67% to 47%, and SI#16 from 73% to 53%. SI#8, SI#20, and SI#16 all have a roughly 20% gap with, as normal, group JM in closer agreement with expert opinion. The smallest gap is for SI#2. Neither group has more than one out of three students agreeing with the expert opinion that the following statement should be disagreed with: "All I learn from a derivation or proof of a formula is that the formula obtained is valid and that it is OK to use it in problems."

Effort survey items hold their relative position, excepting survey item #6. SI#6 drops right in between SI#7 and SI#24, with SI#24 on top for both groups, although SI#7 and SI#24 are tied for group JM. In mirror image to the Math Link cluster, the only survey item on which JM continues to agree more often with expert opinion than FK does is SI#24, going from 53% to 44%. Experts, 53% of group JM, and 44% of group FK disagree with: "The results of an exam don't give me any useful guidance to improve my understanding of the course material. All the learning associated with an exam is in the studying I do before it takes place." For all other survey items of the Effort cluster, the non-math novice students agreed more with the experts than do the students with a semester's worth of calculus-based physics under their belts, group JM. To keep the pattern, percentage shifts still go from JM to FK: SI#6 from 33% to 61%, SI#7 from 53% to 67%, SI#31 from 80% to 81%, and SI#3 from 87% to 97%. My choice of the JM-to-FK ordering made earlier is now counter-intuitive: the more math knowledge and experience, the less likely a student is to agree with expert opinion, which matches the information in Redish et al. SI#6 is by far the most extreme example of this phenomenon, and was discussed and incorporated into the Math Link cluster above.

The MPEX can provide insight and information to an instructor on the initial knowledge state of his students. Constructivists argue that students learn; teachers only facilitate the process. Thus, it is as important to know who is learning as it is to know what they should be studying.



Chapter 16 : MB Data and Concentration Analysis on CCC Physics 4C



16.1 : MB Paper Template and Expansion


D. Hestenes and M. Wells, "Mechanics Baseline Test," The Physics Teacher Vol. 30, 159-166 (1992)

In this chapter, I present the Cabrillo Community College data in three separate formats: first, in the manner of the original MB paper; second, in a concentration analysis patterned after Bao and Redish's paper; third, in a new format that preserves question and student responses. The original MB paper spends a large part of its effort correlating MB data to post-instruction FCI data. I do not perform this comparison, as it does not advance my focus on the initial knowledge state of the student. While the MB is most often given as a post-instruction test, the authors allow that it can be used as a "pre-instruction placement exam" for "advanced university courses." Group CF is such a class; it is the Cabrillo Community College 4C class, the third class in a calculus-based introductory sequence for scientists and engineers. Only four questions are specifically addressed in Hestenes and Wells. Questions 4 and 5 are labeled as especially significant in that they reveal widespread deficiencies in the qualitative understanding of acceleration, and questions 20 and 22 are probes of the conservation of energy and momentum which "present difficulties even to advanced students." Table I in the original paper associates questions to specific topics. Unlike the FCI, the distracters for the MB are not "commonsense alternatives", although they include "typical student mistakes." The Table I presented here has two very minor corrections and uses italics instead of parentheses to show that a question involves more than one concept. The Table II presented here reprints AVH data and presents group CF's percentages.

Table I: MB Modified, Newtonian Concepts on the Mechanics Baseline.

Each concept is involved in the corresponding question


Question

A. Kinematics

Linear Motion

Constant acceleration 1, 2, 3

Average acceleration 18, 23

Average velocity 25

Integrated displacement 24

Curvilinear Motion

Tangential acceleration 4

Normal acceleration 5, 8, 12

a = v^2/r 9, 12

B. General Principles

First Law 2

Second Law 3, 8, 9, 12, 18

Dependence on mass 17, 21

Third Law 12, 13, 14

Superposition Principle 5, 7, 13, 19

Work-Energy 20

Energy conservation 10, 11

Impulse-Momentum 16, 22

Momentum conservation 15

C. Specific Forces

Gravitational free-fall 6, 26

Friction 9















Modified Table II



16.2 : Concentration Analysis on Group CF Data

L. Bao and E. Redish, "Concentration analysis: A qualitative assessment of student states," Am. J. Phys. Suppl. 69(7), S45-S53 (2001)

Concentration analysis measures the distribution of student responses to multiple-choice questions. This information provides insight into possible common incorrect models held by students and into whether a given question is effective in detecting student models. My main purpose in this section is to perform Bao and Redish's analysis so as to better understand their paper, which applies this analysis to the FCI. As spoken to at length in Part III, I do not agree with some of Bao and Redish's paper. I feel that while this method may be useful for researchers confronted by a high volume of data, the average teacher would benefit more from the method presented in part 3 of this chapter. Still, being able to perform concentration analysis is a valuable skill for a person with a PER focus, and it does facilitate the ongoing dialogue in published work.

Basically, we mathematically create a concentration factor C which, paired with the students' score S, can be used as a point on an S-C graph; alternatively, S and C can be binned to create a two-letter label that is then matched to some "implications of the patterns." Bao and Redish also advance a third variable called gamma, which is C without the offset due to S; I do not use gamma. As implied, S and C are not independent, and the S-C graph has boundaries. The original paper also does a graphical shift analysis on S-C graphs showing both pre- and post-instructional data from tutorial and traditional classes. As my focus is on the initial knowledge state of the student, I do not perform graphical shift analysis.

While it is not my goal to repeat Bao and Redish's paper here, there are a few more relevant highlights. C is a number from zero to one that shows how concentrated student responses to a question are, independent of the correctness of the choice. Zero is random student response: each multiple-choice response (a, b, c, d, e) got picked by the same number of students. One is concentrated student response: all multiple-choice responses are unpicked except one, which was picked by everybody. In the equation for C, "m" is the number of different choices available. For the Mechanics Baseline, m = 5 for twenty-three questions and m = 4 for the remaining three questions. "N" is the number of students; in the case of group CF, this equals 42. There exists a small problem in the case of a non-response by a student; this is not addressed by Bao and Redish. The options are: increase m by one in all cases, since all students have the implicit option of not responding to a question, or adjust N to match the number of students who responded to that specific question. In Solomonic fashion, I did both and averaged the results (a computational sketch follows below). Given the gross binning advanced in Table II of Bao and Redish, my choice affected only four of the twenty-six questions (Q22, Q02, Q03, and Q24). The boundaries of the S-C plots are no longer sharp; they vary slightly by question.

The actual equation for C is Eq. (7) in Bao and Redish. The binning is in Table II of Bao and Redish or is marked on the axes of the S-C plot. A modified Table III is presented below that more accurately reflects the written commentary in Bao and Redish's paper. The score S is the fraction of students in the class who got that question correct, in decimal form. Table IV presents the values for both S and C; it also provides the matching two-letter label. An S-C plot similar to Bao and Redish's Fig. 2(b), but using group CF's pre-instruction data, is presented. Finally, prior to its discussion in part 3 of this chapter, Table Alpha, the new data presentation format, is provided.
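For reference, the concentration factor, as I read Eq. (7) of Bao and Redish, is C = (sqrt(m)/(sqrt(m)-1)) * (sqrt(sum of n_i^2)/N - 1/sqrt(m)), where n_i is the number of students selecting choice i. The sketch below, with hypothetical data, computes C and averages the two non-response conventions just described; it is an illustrative reconstruction, not the actual worksheet used for Table IV.

    from collections import Counter
    from math import sqrt

    def concentration(responses, m):
        """Bao-Redish concentration factor, Eq. (7):
        C = (sqrt(m)/(sqrt(m)-1)) * (sqrt(sum n_i^2)/N - 1/sqrt(m)).
        C = 0 when responses spread evenly over m choices; C = 1 when all
        students pick the same choice."""
        counts = Counter(responses)
        N = len(responses)
        root = sqrt(sum(n * n for n in counts.values()))
        return (sqrt(m) / (sqrt(m) - 1)) * (root / N - 1 / sqrt(m))

    def concentration_with_blanks(responses, m):
        """Average the two conventions for non-responses (None):
        (1) treat the blank as an extra (m+1)th choice over all N students;
        (2) drop blanks and let N be the number of students who answered."""
        c1 = concentration(['blank' if r is None else r for r in responses], m + 1)
        c2 = concentration([r for r in responses if r is not None], m)
        return (c1 + c2) / 2

    # Hypothetical question: 42 students, m = 5, four blanks.
    resp = ['B'] * 20 + ['A'] * 10 + ['C'] * 8 + [None] * 4
    print(round(concentration_with_blanks(resp, 5), 3))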



Table III: CA Modified, Concentration Analysis - Implications of Patterns.

Combining score (S) and concentration factor (C), we can code the student responses on a single question with a response pattern using the three-level coding system in Table II.

S C Implications of the Patterns


One model    H H    one correct model ~ students doing well

             M H    [exists in Table IV]* ~ students doing well

             L H    common incorrect student model

Two model    H M    (does not exist, but bin-cutoff dependent)

             M M    two models, one of which is correct

             L M    two models, both of which are wrong

Non model    H L    (does not exist and is logically suspect)

             M L    [exists in Table IV]*

             L L    near-random situation ~ no models


        *Table IV of Bao and Redish, but no commentary.




CA Table IV




Fig 1




Table Alpha



16.3 : Introduction and Discussion of Table Alpha


Table Alpha at the end of this part presents group CF's data in a nice visual layout. Highlighting a subset of questions can bring some fairly detailed information to light, as the following paragraphs make clear. The table can be worked both ways: information about students can be found by highlighting selected questions, and it is possible to study questions by highlighting subsets of students, such as those with minority status. Partitioning the table is also of value, particularly in assigning students to cooperative groups. Given the focus of this paper on the initial knowledge state of the student, we forgo question analysis here.

In the Calculation sub-group (Q12, Q18, Q9, Q11, Q22, Q20, and Q21), thirty-seven out of forty-two students got less than 50%. While only two students got all the questions wrong, nine got all but one wrong. Conversely, no student got this sub-group completely correct, and only one student got all right except one. The five best students (S41, S34, S31, S20, and S35) in this sub-group showed calculations on their test papers in seventeen out of thirty-five possible opportunities. Five of the worst students (S22, S17, S32, S16, and S15) in this sub-group showed calculations on their test papers in three out of thirty-five possible opportunities. What written calculations existed for the better group showed eleven correct answers out of the seventeen calculations; the remaining eighteen possible opportunities showed thirteen correct answers despite no work being shown. For the worst group, all three calculations resulted in wrong answers. Of the remaining thirty-two possible opportunities, three were correct, a percentage result worse than random guessing. The problem with examining only test papers is that I cannot be certain that no scratch paper was used, as the test was proctored by the class instructor and the use of scratch paper was a subject we did not discuss. Question twelve was correctly answered twice. I believe that in both cases this was the result of a lucky guess; no work was shown in either case, and in both cases the companion question number eleven was wrong. Table I shows that question 12 addresses four different Newtonian concepts. For those few students who showed their work, it seems that the final nail in the coffin was forgetting that the centripetal term must be added to the person's weight. For example, S41 wrote F = mv^2/r, substituted in F = 50(20)/5, and circled "E", none of the above, as his answer. This shows an understanding of normal acceleration, a = v^2/r, and Newton's second law. It arguably shows a lack of understanding of the third law, but I suspect it is more a case of "I've found 'a' force so it must be 'the' force", and not bothering to look around for any other forces to perform a vector summation with. Weight is written down on only one of the forty-two papers. On the other hand, choice "B" is very close to the weight alone, with no centripetal term added to it. "B" was chosen by thirteen out of forty-two students, and the weight calculation is possible to do in one's head. Eight of these thirteen wrote nothing on their papers (half of all those who wrote nothing), and three more of these thirteen had written only a single formula with no numbers; for two of the three, the formula was F = mv^2/r. Q12 was atypical in that twenty-five students wrote something down on paper in an answer attempt. Proportionally, S25 was most impacted by this Calculation sub-group; six of his eleven non-correct responses were in this sub-group. Of these, four were non-attempts at the four most difficult of this sub-group's questions. Conversely, S35 shines with only two of his thirteen failures in this sub-group. Still, the overall result of 34% passing is dismal. Each question can be so analyzed, but that is not the purpose of this paper. Focusing on the initial knowledge state of the student, it is fair to say the following. First, the vast majority of students do not show calculations for those multiple-choice questions that are designed to reward such explicit effort.
Without individual interviews it is impossible to say for sure, but given the literature's comment about students seeking efficiency even at substantial cost to their physics understanding, I would venture that the multiple-choice format encourages students to mentally do "just enough" to pick one of the given answers, after which the student moves on to the next question without completing his work, much less checking it. Second, students did significantly better on those Calculation sub-group problems that addressed only a single Newtonian concept and were not also part of the Diagram or Kinematics sub-groups (Q11, Q22, Q20, and Q21). Third, out of seventy-four blanks, forty-two are in the Calculation group: 58% of all blanks in just 27% of all questions (seven out of twenty-six). While it is a good test-taking strategy to skip the hard problems, one is supposed to come back to them at the end of the test, and guess if you have to, particularly if you can exclude some possible answers. While eighteen of the blanks look as if they are the result of time pressure, the remaining fifty-six indicate poor test-taking strategies on the part of fifteen students. Although student guessing means a bit more work for the researcher, students should know how to maximize their scores on multiple-choice tests. Such tests can have major impacts on their lives (think SAT, GRE, etc.), and explicitly teaching this skill should be part of physics' "hidden agenda".
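To make the missing Q12 step explicit: reading S41's substitution as m = 50, v^2 = 20, and r = 5 (the units, g of roughly 9.8 m/s^2, and the bottom-of-the-circular-path geometry implied by the discussion above are my assumptions), the supporting force must carry both terms:

    \[
    N - mg = \frac{mv^2}{r}
    \quad\Longrightarrow\quad
    N = m\left(g + \frac{v^2}{r}\right) \approx 50(9.8) + \frac{50(20)}{5} = 490 + 200 = 690\ \mathrm{N}.
    \]

Stopping at mv^2/r = 200 N, as S41's work does, keeps only the centripetal term; stopping at the weight mg of roughly 490 N, which choice "B" sits very close to, keeps only the other.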

Highlighting Q12, Q5, Q18, Q7, Q26, Q13, and Q19 brings the Diagram sub-group into relief. Both the Diagram and the Calculation sub-groups have seven questions; one of the starker differences between them is the comparative lack of blanks in the Diagram sub-group: twenty compared to the Calculation group's forty-two. Unfortunately, the overall success rate is even worse; the Diagram sub-group was correctly answered only 31% of the time. The Diagram sub-group comprises those questions for which force diagrams would facilitate the solution, according to Hestenes and Wells' original paper. The fundamental problem is that students did not draw force diagrams on their test papers. The worst five students (all seven wrong) in this sub-group (S17, S38, S1, S37, and S7) drew two out of a possible thirty-five force diagrams. Five of the best students (two or three wrong) (S41, S30, S5, S34, and S29) drew one out of a possible thirty-five force diagrams. To add insult to injury, all three diagrams were incorrect and resulted in incorrect answer choices. In a generally poor field, a couple of students manage to stand out. S7 had seven of his thirteen total failures in this sub-group. Conversely, S16 had only three of his fifteen total failures in this sub-group. While Q13 and Q15 managed a 50% pass rate, the three toughest questions (Q12, Q5, and Q18) are in this sub-group. Q5 has a very strong wrong distracter in its choice "E": half the class chose it, three times as many as picked the correct answer, "A". Student interviews would be interesting. I wonder if students are aware that in Q5 they are in a region of circular motion. I hypothesize that they see a block going down a hill, then back up. They realize that there is a "change" in the direction of motion and quickly guess (wrongly) that it is analogous to a ball being thrown up into the air, and just say that the acceleration is zero momentarily, in a false similarity to the ball going through zero velocity at the top of its "arc". This misguess is facilitated by the incorrect arc we draw in the ball problem; the ball actually goes straight up and down over the same path, but that is not how we draw the ball problem on chalkboards. Further complicating the issue, only one of the seven students who got Q5 correct also got a majority of the remaining, easier Diagram sub-group problems correct; that, plus the lack of force diagrams, does nothing to build confidence that all seven know why their choice is correct. On the other hand, there are no blanks for Q5, indicating a general perception that the question was answerable. To finish off the thought, eight people would get this question right if the class engaged in random guessing. S20, S31, S34, and particularly S41 have a majority of their errors in this sub-group and yet did comparatively well on the test as a whole. Telling them that drawing force diagrams on multiple-choice tests is appropriate may well be all they need, but the majority of the class should be explicitly assessed on whether they can construct force diagrams, not merely whether they can recognize an appropriate venue for their use.

The large Kinematics sub-group (Q12, Q5, Q18, Q9, Q25, Q24, Q8, Q23, Q1, Q2, Q3, and Q4) shows some improvement over the other sub-groups, coming in at 40% correct overall, but this is still below the whole test's average of 47%. A few students did well to OK, notably S34, S41, S30, S24, and S33. Student 34 in particular got only one wrong out of twelve questions, and that was Q12. Still, many students showed a complete lack of kinematics understanding, with twelve students out of forty-two getting 25% or less of these questions correct, a level close to random guessing (20%). S38 managed the amusing feat of getting them all wrong. In percentage terms, S9 outdid him: ten of S9's total of fourteen wrong answers were in this sub-group, for 71%; S38 only had 57% of his personal wrong choices in this sub-group. Some of the easier questions in this sub-group are Q1 and Q2. They refer to the same figure, and while having a nice pass rate of twenty-eight out of forty-two, still have seven students with the paired misconceptions of "A" for Q1 and "E" for Q2. This pairing is a graphical mirror image of the correct answer, and given one choice, the other is logically consistent. Student interviews might reveal this to be a graph-reading error rather than a physics misconception. Q4 is another of the easier questions, and choice "D" might very well indicate that students believe gravity is the "only" force, forgetting the normal force and forgetting that acceleration is by definition in the direction of the change in velocity. Improvement on tests is as much a matter of making sure a student gets the easy ones as it is a matter of tackling the difficult issues. Still, the difficult issues should be tackled, and Q9 is by far the most ignored question, with fifteen out of forty-two students leaving it blank; the runner-up is Q18 with eight non-respondents. Those students who did respond to Q9 picked incorrect response "B" by a plurality. Of these ten "B" responders, only one showed any work on his test; that student (S15) actually got v = sqrt(μgr) but showed no numbers, thus indicating a math rather than a physics error. Unfortunately, the mesh of physics errors, math errors, and the multiple-choice format's enhancement of laziness is hard to untangle. A separate math diagnostic test, as used by Halloun and Hestenes with their "Instrument", would be beneficial to me as an instructor, as Part III details in depth. Other easy and interesting issues include the choice of "C" in Q23. I actually could not see how "C" was chosen until one paper wrote (5 m/s)/(6 s) for me, and sure enough 0.83 pops out. The question now is why 5 m/s instead of 5 m/s - 1 m/s = 4 m/s, which is then divided by 6 s to get the correct response. Is it conceptual difficulty with Δv = v_f - v_i, or is it a graph-reading error? Six students out of the ten wrote something on their tests for this problem, some quite complex. I did not understand five of them and would have enjoyed having a student explain some of these attempts to me; two in particular were very detailed. Q8 is amazing: we could get this question from seventeen correct to thirty-three correct just by having our students agree that net force is a vector in the SAME direction as the acceleration vector, something that should be quite possible for third-semester calculus-based physics students.
For our final comparatively easy question, Q24, the papers of those who chose "E" (fourteen students) are mostly blank; one shows (0.67)*6 = 4.02, another shows x = v_0 t + (1/2)at^2 = 1(6) + (1/2)(2/3)(6^2) = 18 m, and two show marks on the graph which I cannot follow; the remaining ten show no work. As a side note, ten of the fourteen who chose "E" on Q24 did get the correct acceleration on Q23. Student 8 should be applauded for remembering the distance formula and for showing his work, both worthy accomplishments given his peer group, and then reminded about the caveat of "constant acceleration" and how the graph distinctly shows that the acceleration is not constant, even though an average acceleration can be found.
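For clarity, the arithmetic behind the Q23 distracter and the Q24 caveat discussed above, using only the graph values quoted in the text, is:

    \[
    a_{\mathrm{avg}} = \frac{\Delta v}{\Delta t} = \frac{5\ \mathrm{m/s} - 1\ \mathrm{m/s}}{6\ \mathrm{s}} \approx 0.67\ \mathrm{m/s^2}
    \qquad\text{versus the distracter}\qquad
    \frac{v_f}{\Delta t} = \frac{5\ \mathrm{m/s}}{6\ \mathrm{s}} \approx 0.83\ \mathrm{m/s^2}.
    \]

The same endpoint-versus-difference confusion feeds the Q24 error: x = v_0 t + (1/2)at^2 presumes constant acceleration, which the graph rules out, so the displacement must instead come from the area under the velocity-time curve, per Table I's "Integrated displacement" entry.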

A different method of analyzing Table Alpha is to partition it. I partitioned Table Alpha into nine parts by dividing between S27 & S10, Q2 & Q10, S37 & S12, and Q3 & Q4. The overall passing percentages per partition follow: upper left 28%, upper middle 17%, and upper right 05%; middle left 73%, middle middle 47%, middle right 20%; lower left 93%, lower middle 77%, and lower right 49%. The passing rate for the test as a whole was 47%, the same as the middle-middle partition. Random guessing would result in roughly a 20% passing rate. With the singular exception of S34, all students would benefit from directed instruction by the teacher on the five most difficult questions (Q12, Q5, Q18, Q9, and Q25). Students in the right-most partition would also benefit from directed instruction on the middle group of questions, perhaps in a mandatory discussion-group environment also open to the other students on a voluntary basis. To offset the social stigma of mandatory participation, perhaps a ten to twenty percent "extra credit" could be applied to participants' grades. The bottom five questions do not need teacher-directed instruction. The left-most group of students can act as peer instructors on these topics given their 93% pass rate. Homework or class work should be given on these topics both to increase the middle and right groups' abilities in these subjects and to provide the left group with opportunities to be peer instructors. In-class cooperative grouping with one student from the left group, one student from the right group, and two students from the middle group would prove beneficial to all, as the lowest group of questions covers a surprisingly wide array of topics (a sketch of such grouping follows below). If such formal cooperative grouping is not part of the class structure, explicitly using the left group as tutors on homework, in exchange for that wonderful motivator, extra credit, would go a long way in helping all students. Explicit teacher effort to build an environment conducive to student mastery of Newtonian concepts is required, as the literature indicates that without this effort, students are not likely to change.
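The cooperative grouping suggested above reduces to simple roster arithmetic. A minimal sketch, assuming a class list already ranked best to worst by Table Alpha score (the names, the equal-thirds cut points, and the leftover handling are all hypothetical choices, not the partition lines actually used above):

    def cooperative_groups(ranked):
        """Form 4-person groups: one student from the left (top) third, two from
        the middle third, one from the right (bottom) third of a ranked roster."""
        n = len(ranked)
        left = ranked[: n // 3]
        middle = ranked[n // 3 : 2 * n // 3]
        right = ranked[2 * n // 3 :]
        groups = []
        while left and right and len(middle) >= 2:
            groups.append([left.pop(0), middle.pop(0), middle.pop(0), right.pop(0)])
        leftovers = left + middle + right   # uneven thirds; assign at instructor's discretion
        return groups, leftovers

    roster = [f"S{i}" for i in range(1, 13)]   # hypothetical, already ranked
    print(cooperative_groups(roster))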

In looking for questions that discriminate between groups, several stand out. Notably, Q6 isolates the combined left and middle groups from the right group; the passing rates by group, left to right, are roughly 100%-90%-20%. Q8 is the strongest of several discriminators between the left group and the combined middle and right groups; the passing rates by group, left to right, are roughly 100%-20%-20%.



Chapter 17 : FCI Data on CCC Physics 4A


17.1 : Presentation of Group JZ's Raw FCI Data in Table A

Group JZ is Cabrillo Community College class 4A. It is the first in a sequence of four calculus-based introductory physics classes; it has forty-four students. In this part, I present the cleaned-up raw FCI data for group JZ in Table A. The clean-up required dealing with student #23, questions not responded to, questions eliciting a double response, student-initiated written comments, and completion-time notations. Table A is not hierarchical and presents raw data in student and question-number order, excluding student #23. Student #23 mislabeled his responses so that he ended at question 30, when the original FCI is only a 29-question test. I was unable to figure out how to realign his responses and so left him out of consideration.

All questions not responded to are marked with pound signs (#) in Table A and referred to as "black boxes". This includes student-provided question marks, questions numbered but not answered by the student, and complete blanks with no student acknowledgment that they even saw or got to the question. There are twenty-six black boxes in Table A; this is two percent of all possible responses. These black boxes are distributed among five students. Of these five, four students failed to do the last page of the test, questions 26, 27, 28, and 29. Of these four students, two (S2, S20) showed lack of time as the cause of their incompleteness, and the other two (S15, S33) showed an unawareness of the last page's existence.

All questions eliciting a double response are marked in Table A by presenting their first response in italics. In order to keep the test-taking instructions between groups JZ and PG the same, students were allowed to mark a second response if they were not confident of their first. The second response was clearly labeled as such, and fifteen students availed themselves of this opportunity (35%). They provided double responses a total of thirty-one times. The test itself had a total of 29 x 43 opportunities for response; thus, double responses are two and a half percent of all opportunities. Two questions (Q2 and Q15) each had three students avail themselves of this opportunity; all other questions had fewer double responders. The double response usually had one correct response, with only four out of thirty-one double responses having both responses incorrect. The remaining twenty-seven split 15/12: fifteen students who got their first response correct showed their uncertainty by picking incorrect second choices, and twelve students who got their first response wrong showed their uncertainty by then picking the correct choice a bit late.

All questions eliciting student-initiated written comments have an * in that response's table cell. There were twenty-two unsolicited statements on the forty-three examined tests. Three students (S40, S36, and S33) did most of the commentary, at four comments each. Six questions (Q1, Q3, Q5, Q10, Q16, and Q19) elicited two comments each, for a majority of all comments. Most comments could be categorized as having problems with the concept of "ideal space"; whether these problems were real or legalistic, however, is open to interpretation. The following are a few examples. Question 10 elicited a couple of comments to the effect that rotation ("English" or "spinning") would impact the subsequent path of the ball. This could be viewed in two lights.

First, there is reference to real-world actual experience. Balls often spin at least to some degree, and with real-world friction, this results in curved paths. This appeal to the real world is most evident in a comment in regard to question #29 by a student who got seventeen out of twenty-nine on the test (59%). He stated: "TRIED IT with my Pencil!", and then answered incorrectly with choice "A". This ties into the literature in two ways. First, epistemologically, conducting an experiment with concrete reference is commendable. Second, Hammer emphasizes how difficult it is for most beginning students to view knowledge through the prism of ideal space. The problem for this student experimenter, most likely, is that if you stop applying a (small) force to a (high-friction-coefficient) pencil, it will seem to the human eyeball (a low-resolution detection device) to "stop immediately" (choice "A"). The correct Newtonian response is "C", "immediately start slowing to a stop", as the problem does specify the existence of a frictional force of unknown magnitude.

The second way to view Q10's comments is not as an appeal to real-world activities, but rather as an appeal to legalism. There are legalists amongst our test takers, which is made most evident by the two written comments on question #19. One response is by a student who got twenty-one out of twenty-nine on the test (72%); the other respondent got twenty-five out of twenty-nine (86%). "If man pulls harder than boy" was the response of the first. "Assuming they pull different amounts" was the response of the second. Both students correctly answered "B", thereby implicitly accepting the assumption they questioned: that in fact "a large man" does pull harder on the crate than "a boy" does. The students thus get both their politically correct disclaimers and credit for their politically non-correct answer. As a small aside, few students provided both double responses and written comments, with only six responders doing both.

The amount of time required to take the test varied considerably, with the twenty-nine-minute average based on the subset of students who provided the requested finishing time on their papers. Harvard University data in the original FCI paper lists an average of twenty-three minutes.















Table A



17.2 : Table B, A New Format


Table B presents the same data as Table A in a more analytically friendly format and is the foundation for the rest of this chapter. Table B is hierarchical, with the best student on the left and the worst on the right; the easiest question is at the bottom, the hardest at the top. The letter responses have been converted to blanks for correct responses and to misconception codes for incorrect responses. The numbers in parentheses are the numbers of blanks. Note: the question mark (?) is a code in modified Table II, and some response grids hold two or three misconception codes because some MCSR items have more than one associated misconception code.

Before doing my own analysis, I would like to tie in three ideas presented by Hestenes et al. in their original FCI paper. The first of these is their comment that Q5, Q9, Q12, Q22, and Q18 "lend themselves to analysis by force diagrams." This is odd given the literature's stance that the FCI does not require physics formalism and is valuable as a pre-instruction assessment tool. No student in group JZ used force diagrams on any question. Thus, it is not surprising to find four of these questions in the upper (difficult) half of Table B. These five questions all deal with the Newtonian concept 5G (Gravitational Force), but only Q5 does so uniquely. I1, I3, and CI1 look like strong distracters against 5G. Unfortunately, only I1 is used in multiple questions, and then only twice. So, on the face of it, it is difficult to know how much the lack of force diagrams has to do with the results, but Chapter 16 of this Data Analysis highlights the lack of force-diagram usage by more experienced students on the MB. There, too, questions associated with the use of force diagrams were found to be among the more difficult. In part, this difficulty can be traced to the rare usage of force diagrams, with incorrect diagrams predominating among those drawn.

Hestenes et al. labeled Q19 and Q29 as "weak discriminators, so they could be dropped from the test." It is possible to arrive at correct answers for both via non-Newtonian beliefs. Q19 is in the easiest quadrant of Table B (79%). Excluding the five blanks, Q29 is also quite easy, with a 74% pass rate. The downgrade in Q19's value is particularly regrettable because it is the only question that pits the CI1 and CI2 misconceptions against each other, with a clear win for CI2. This is particularly noteworthy in comparison to Q18, where CI1 predominates over AF1, G1, and the three-concept Newtonian response. Q29 also would be more valuable were it not undermined by the authors' comments, particularly if its response "E" were altered to match student misconception AF1. Currently response "E" is not picked by any student and is associated with misconception I4. Were it altered, Q29 would make for an interesting comparison with Q15, given the common Newtonian concept 5S and their current relative positions.

Finally, Hestenes et al. speak to the "persistence" of 3B and 3D test responses, both of which match student misconception G3 (heavier falls faster), and then compare Q3 to Q1. Group JZ data (Table B) confirms the vast disparity between Q3 and Q1 results, if not the persistence. The cause of the disparity is not evident in the coding of the questions. Both questions have 5G alone as the correct Newtonian concept. Both have [G3] and [?] as the only two student misconceptions. So, by labeling alone, these two questions are identical; yet the results are not. After looking at the questions themselves, I believe Q3 requires additional knowledge about vector decomposition and the effect of orthogonal forces on motion that Q1 does not require. Thus, we are warned that questions which are labeled exactly the same may well not be. See pairs Q1 & Q3, Q27 & Q8, and Q14 & Q13, pairs which show marked disparity in results yet are labeled exactly the same.












Table B













17.3 : Table C, Misconception to Question

Table C presents the same data as Table B, organized by question and misconception. The numbers are the count, out of forty-three students, who share that misconception on that question. I use Table C by highlighting a given Newtonian concept, and I use it by question and by misconception as well, although not by student, as that data is lost in this format. A computer could make each number a vector of student identities and allow individual student Table Cs to be generated. Unfortunately, such effort to identify individual initial knowledge structures is unwarranted, as we shall see. Blanks in this table are unavailable options, and zeros are available but unselected options. The rows will not all add up to forty-three, even remembering the Newtonian thinkers, because a single MCSR item can be associated with more than one misconception.

Some misconceptions were chosen often, such as I3 (forty-eight times), but had low percentages (16%) because of their high prevalence on the FCI (301 opportunities). Other misconceptions, such as G3, had high percentages (33%) which are diminished in value because of the absence of competing misconceptions. In the case of G3, it competes only with the Newtonian concept code [5G] and the misconception code [?]. In contrast, I3 competes with four Newtonian codes (5G, 1, 0, and 5S) and with nine misconception codes (I5, CI2, CF, I4, R1, G5, I1, AF2, and G4). Further complicating the issue, I3 is offered on seven questions and thus is a strong candidate to have been picked by students engaged in random selection (guessing). G3 is offered on only two questions, but is associated with two responses on each question.
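The chosen-count versus opportunity-count percentages above (e.g., I3 at forty-eight selections over 301 opportunities, about 16%) reduce to simple bookkeeping over the coded table. A sketch with hypothetical data structures standing in for Tables B and C follows; the function and variable names are my own illustrative choices.

    def misconception_rates(coded, offered):
        """coded: question -> list of per-student code sets (empty set = correct).
        offered: question -> set of misconception codes available on that question.
        Returns code -> (times chosen, opportunities, percent chosen)."""
        chosen, opportunities = {}, {}
        for q, answers in coded.items():
            for code in offered[q]:
                opportunities[code] = opportunities.get(code, 0) + len(answers)
            for codes in answers:
                for code in codes:
                    chosen[code] = chosen.get(code, 0) + 1
        return {c: (chosen.get(c, 0), opportunities[c],
                    100.0 * chosen.get(c, 0) / opportunities[c])
                for c in opportunities}

    # Toy example: two questions, three students; G3 offered on both.
    coded = {'Q3': [{'G3'}, set(), {'G3'}], 'Q1': [set(), {'G3'}, set()]}
    offered = {'Q3': {'G3'}, 'Q1': {'G3'}}
    print(misconception_rates(coded, offered))   # G3: 3 of 6 opportunities, 50%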

In seeking student knowledge structures, Tables C and B work in tandem. From Table C, I might hypothesize that a student who believes G3 (heavier objects fall faster) at least has the possibility of also believing AR1 (greater mass implies greater force). The two do not directly compete against each other, both have decent percentages, and they do not sound automatically self-conflicting. Going back to Table B, we find that of the twenty-two students who picked G3 on Q3, only 28% (25 out of 88 opportunities) picked AR1 when given the opportunity. Given that 20% is random guessing, 28% is not a call to arms. Worse, of the twenty-two students who picked G3 on Q3, only four chose G3 when given the opportunity to do so again on Q1 (18%). To complete the circle, of the six students who chose G3 on Q1, only four also chose it on Q3 (67%). This example highlights the fundamental result that I was unable to determine an alternative belief structure of student misconceptions that is strongly held and in competition with the Newtonian. Adding a third misconception, such as K2 (velocity-acceleration undiscriminated), to tie AF1 and G3 together reduces the results to unity. There are four students (S32, S42, S16, and S5) who chose AF1 on Q15, G3 on Q3, and K2 on Q21. These four had the opportunity to pick AF1 five more times each (Q18, Q13, Q22, Q11, and Q14) for a total of twenty opportunities; they picked it four times. These four students had the opportunity to pick G3 again on Q1; they did so zero times. They had the opportunity to pick K2 again on Q20 and did so twice. This is not the definition of an alternative belief structure, even though S6 would be interesting to interview. The problem is that if each individual has a unique structure made of common components, the teacher is reduced to addressing the most prevalent components piecemeal.
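Every cross-check in this tandem use of Tables B and C has the same form: of the students who satisfied one condition, what fraction satisfied another? A minimal sketch of that computation follows; the helper and the toy data are hypothetical, since the real tallies were made by hand against Table B.

    # Hypothetical layout: each student is a dict of question -> chosen code.
    students = [
        {"Q3": "G3", "Q1": "G3"},
        {"Q3": "G3", "Q1": "?"},
        {"Q3": "5G", "Q1": "G3"},
    ]

    def conditional_rate(students, condition, target):
        """Fraction of the students satisfying `condition` who also
        satisfy `target`; None when no student satisfies the condition."""
        base = [s for s in students if condition(s)]
        if not base:
            return None
        return sum(1 for s in base if target(s)) / len(base)

    rate = conditional_rate(
        students,
        condition=lambda s: s.get("Q3") == "G3",
        target=lambda s: s.get("Q1") == "G3",
    )
    print(f"P(G3 on Q1 | G3 on Q3) = {rate:.2f}")  # 0.50 for the toy data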

Table C is particularly useful in test analysis. It shows, with some specific highlighting, what distracters work best against a specified Newtonian concept and, more importantly, which do not. For example, student misconception code I3 (impetus dissipation) works well as a distracter in gravitational force problems, but fails to distract any students in kinematics problems. Thus, Q23 choice "E" would be a candidate for alteration to some other misconception code, if the question itself could be so altered. This, however, goes far beyond my focus on the initial knowledge state of the student, so I will leave it for now and introduce Table D in the upcoming section.


















Table C



17.4 : Table D, Misconception to Large FCI Divisions

Table D is the result of correlating misconceptions to the larger divisions of the FCI. It is useful in conjunction with the previous tables, but primarily highlights the structure of the FCI itself rather than possible student states. One example is provided here, with the remainder relegated to the last part of this chapter. It is evident from Table D that misconception G5 is a valid distracter for the Newtonian concept Gravitational Force [5G]. G5, however, is useless as a distracter against Kinematics [0]. The previous table allows misconceptions to be matched against specific questions. Q23 is the only question of interest, and item 23E is our focus. In this item, the combined misconceptions G5 and I3 are pitted against the combined Newtonian concepts 5G and 0. [Do not confuse 5G and G5; misconceptions always start with a letter and Newtonian concepts with a number.] G5 and I3 lose completely (a null result). The original FCI paper's data also show that 23E is rarely to never chosen. MCSR items are precious resources; another misconception should be allocated to item 23E. Reading Q23 itself suggests making 23E the mirror image of response 23C, but the following analysis indicates the desirability of finding some practical method of assigning misconception AF6 to item 23E. Q23 is the only question that combines Newtonian concepts 5G and 0. Misconceptions G5, CI2, AF6, I3, #, and ? are the misconceptions that overlap with both 5G and 0 at some point. G5 and I3 are null results for Kinematics [0], but do well against Gravitational Force [5G]. CI2 is the reverse: it does well against 0, but does very poorly against 5G. AF6 affects both equally on separate questions (leaving aside the issue of the 2nd Law [2]). As AF6 seems to affect both 5G and 0, using it as a distracter in a question that combines 5G and 0 makes more sense than an I3/G5 combination nobody picks. Going back to question 23 itself reveals that 23E is a very odd choice indeed. Yet the original FCI claims both G5 and I3 as valid, useful distracters to kinematics, which they are not.

One of the major claims of the FCI is that it assesses thirty specific misconceptions, not merely twenty-three divisions within the Newtonian force concept. The literature implies that students structure their misconceptions. To find these misconception structures, a test must be designed to pit misconceptions against each other, not just against valid Newtonian concepts. While null results are valuable during test development, and even should be reported, they are much less valuable in general test application. MCSR tests are not a substitute for individual interviews but rather seek the frequency of known misconceptions in a large population. The precious MCSR items should not be wasted on known statistical nulls. Given the noise from low responses, the incomplete competition between misconceptions, and the large disparity of misconception availability, finding a structure, if such exists, is beyond my abilities. I have instead found changes that I would like to make in the FCI, and some all-but-random connections more notable for their oddity than for their utility. Still, in an effort to show some of what I have gleaned from these tables, I will present some additional insights in the last part of this chapter.















Table D



17.5 : Additional Insights


In this Table B macro analysis, I drew lines at the junctions of Q13 & Q5, S43 & S17, Q23 & Q17, and S26 & S10. These roughly correspond to 25%, 50%, 25% divisions of the axes and result in nine partitions. Q15 through Q13 are labeled hard (H). Q17 through Q10 are labeled easy (E). All other questions are labeled medium (M). S34 through S43 are labeled Newtonian thinkers (N). S10 through S32 are labeled non-Newtonian thinkers (non-N). All other students are labeled average (A). Numbers are correct percentages: NH = 61, AH = 31, non-NH = 6, NM = 91, AM = 58, non-NM = 27, NE = 98, AE = 80, and non-NE = 59.
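The nine cell percentages are simply banded averages over the sorted score matrix. The sketch below shows the tally under stated assumptions: a hypothetical four-by-four 0/1 matrix stands in for Table B, rows are already sorted from best student to worst, columns from hardest question to easiest, and the two cut points on each axis are parameters.

    def partition_averages(scores, row_cuts, col_cuts):
        """Percent correct in each of the nine cells formed by two row
        cuts (student bands) and two column cuts (question bands)."""
        row_bands = [(0, row_cuts[0]), (row_cuts[0], row_cuts[1]),
                     (row_cuts[1], len(scores))]
        col_bands = [(0, col_cuts[0]), (col_cuts[0], col_cuts[1]),
                     (col_cuts[1], len(scores[0]))]
        cells = []
        for r0, r1 in row_bands:
            row = []
            for c0, c1 in col_bands:
                vals = [scores[i][j] for i in range(r0, r1)
                                     for j in range(c0, c1)]
                row.append(round(100 * sum(vals) / len(vals)))
            cells.append(row)
        return cells

    # Hypothetical 4-student by 4-question 0/1 matrix; cuts at 25% and 75%.
    scores = [[1, 1, 1, 1],
              [0, 1, 1, 1],
              [0, 0, 1, 1],
              [0, 0, 0, 1]]
    for row in partition_averages(scores, row_cuts=(1, 3), col_cuts=(1, 3)):
        print(row)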

In Table B, highlighting empty boxes left to right yields the insight that questions 05, 21, 07, 20, 17, 04, 14, 01, and 10 are guaranteed to be answered correctly by Newtonian thinkers. Questions 22, 25, 11, 08, 29, 16, 23, 19, and 27 are almost so guaranteed. Newtonian thinkers in the original paper are students with 80% mastery, or 23 out of 29 questions correct; I use the same standard. The original paper labels 60%, or 17 out of 29 questions, as minimal mastery; my non-Newtonians are at the 40% level (12 out of 29). Highlighting the filled boxes right to left makes evident that non-Newtonian students will fail to get questions 18, 3, 13, 5, and 22 correct. They will have a below-random chance at questions 15, 28, 24, 09, and 07. They will finally achieve a random chance at questions 2, 26, and 11. Single questions able to discriminate between Newtonian and non-Newtonian students are questions 5, 22, 11, and 7.

If a student gets Q8 wrong, he will get Q3 wrong. Q8 assesses Newtonian concept [1]; Q3 assesses Newtonian concept [5G]. A choice of misconception I3 or I4 on Q8 guarantees a choice of misconception G3 on Q3. If a student gets Q20 wrong, he will get Q3 wrong. Q20 assesses Newtonian concept [0]. A choice of K1 or K2 on Q20 also guarantees a choice of G3 on Q3. Interesting, but if a student gets Q8 wrong, he should also get Q1 wrong, as it also is a [5G]. This is not the case. The same applies to Q20 and Q1. The relationship does not work in the other direction: getting Q1 wrong guarantees nothing about either Q20 or Q8. These are artifacts of the non-correlation between Q1 and Q3 in the first place. Q1 and Q3 are not correlated despite having the exact same Newtonian concept and misconception choices. Even though Q1 and Q3 are labeled the same, they are not, as the results show. I believe Q3 requires an understanding that forces in orthogonal directions are independent, which Q1 does not. Further, if I3/I4 correlates to G3 and K1/K2 correlates to G3, it is reasonable to look for correlation between I3/I4 and K1/K2. None exists, although this is muddied by the lack of any common competitor.

Misconception AF1 is not self-correlated across six questions. This is odd because AF1 was the most prevalent answer to the hardest question, Q15. This implies people prefer AF1 to both Newtonian concept [5S] and misconceptions [Ob, ?]. AF1 (rating 15%) is never pitted against 5S again and does badly against Newtonian concept [3] and misconceptions AR2 (29%), I2 (12%), and CI1 (28%). On the other hand, in the only other question in which 5S is alone (Q29), 5S does quite well against misconceptions I3 (16%), R1 (22%), and AF2 (14%). So why should AF1 trounce 5S (Q15) while AF2 fails to have a similar impact on 5S (Q29), particularly as they have essentially the same rating? Is it because of intrinsic differences between AF1 and AF2, or is it the presence of the R1 distracter in one question and not the other? Perhaps all 5S questions are not equal: Q15's 5S is solid contact, impulsive, and Q29's 5S is solid contact, friction opposing motion. Further diluting AF1 as a prime candidate for an alternative belief is the odd fact that a student who picked AF1 on Q15 [5S] is twelve-to-three more likely to pick AR2 over AF1 on Q13 [3].

AR2 is the most consistent of distracters. It is applied against 3rd Law problems only (Q13, Q11, Q14) and competes against the same misconceptions (AF1, AR1, and Ob). If you pick AR2 on Q14 (rating 79%), there is a six-out-of-seven chance of picking it on Q13 (rating 40%). This strong relationship does not hold both ways: students who pick AR2 on Q13 are more likely to get Q14 correct than to pick AR2 again, at a ratio of fourteen to six. So, like Q1 & Q3, Q13 & Q14 are labeled the same but show disparity in the results. Q13 & Q14 are mentioned in the original FCI paper, but only to say that they "appear different to most students". First note: Ob is a worthless distracter for 3rd Law problems. Reassigning these MCSR items to other distracters would benefit internal analysis, as long as awareness of the Ob results was not lost. Second note: Q2 is the fourth 3rd Law problem and is paired with Q11 as "Impulsive Force". Q2 does not use AR2 as a misconception. The overwhelming choice of AR1 on Q2 does not indicate a stable misconception, as AR1 is pitted against AR2 on Q13 and loses four to twenty. Further, AR1 loses again, zero to seven, on Q14. The apparent balance in Q11 is an artifact of AR1 and AR2 sharing a common MCSR item. Thus AR2 reigns supreme as the alternative to the 3rd Law. This supremacy hides an unknown additional factor, as the rating difference between Q13 and Q14 brings to light. Third note: independent of distracter choice, if you get Q14 wrong, you have roughly an 85% chance of getting Q11, Q2, or Q13 wrong. If you get Q11 wrong, you have close to a 95% chance of missing Q2 or Q13. If you get Q2 wrong, you have a 75% chance of missing Q13. This is odd because Q2 and Q11 are impulse-force form, and any correlation to Q13 and Q14 (continuous force) should be the same. These correlations only work one way. For example, eight selectors of AR2 on Q13 are happy to get both Q11 and Q14 correct.

I4 was a distracter for Newtonian concepts 0, 1, 2, and 5S. It suffered a null response against 5S (Q29). Even leaving out Q29, there is no correlation between choosing I4 as a distracter in any two situations. Q8 and Q27 both deal with Newton's 1st Law; both share the same sub-categories of "no force" and "speed constant". In both Q8 and Q27, the only competing misconception is I3. Thus, like pairs Q1 & Q3 and Q13 & Q14, Q8 (ranking 65%) and Q27 (ranking 81%) are an identical pair. And again we have a ranking disparity, which in this case would be worse if you exclude the four students who never got to Q27 in the first place. Interestingly, excluding blanks, picking I4 on one of the pair guaranteed the other was correct! Two questions labeled as the same, with very different results; the labels seem to be, at best, incomplete. Q27 is part of a series of questions on the same subject, so the previous questions may have helped guide the student toward the Newtonian choice, but without student interviews this is conjecture. Radically different results on identical questions should raise all kinds of red flags and be spoken to in the presenting paper, including acknowledgment of explicit but unlabeled language or math issues impacting choices in addition to the physics misconceptions. McDermott does not use the FCI, and I would be interested in the reason. Finally, if a student picked I4 on Q24, he was most likely (83%) a Newtonian or Average student. Non-Newtonian students strongly preferred (75%) distracter CI2 over I4. In a probable fluke, if you chose I4 on Q24 you were guaranteed to pass both Q10 and Q19.

Having found and exhausted the only pairs of identical questions, life becomes more complicated, but there are a few more items of interest. The non-self-correlation of misconception CI2 across its six questions is hardly surprising given that these questions encompass six major divisions of force [0, 1, 2, 3, 4, and 5G]. What is surprising is that a single misconception could be viewed as a valid distracter for so wide a range of Newtonian concepts. CI2 fails to distract in Q16, Q4, and Q1. Again we argue that MCSR items are valuable commodities and, outside of test development, should not be wasted on null responses. Misconceptions G4 or CI1 might prove more useful as distracters in Q16 than CI2, in light of Q5 and Q18. It is tantalizing, in our search for a student's alternative belief structure, to note that students who pick CI2 on Q24 also do not pick G1 on Q12, thus tying together, by a negative correlation, the Newtonian concepts 2/0 and 5S/5G. Unfortunately, this connection between CI2 and G1 does not hold for Q19 responses; however, Q19 does have an entirely different Newtonian concept [4].

Q22 and Q9 both have misconception I1 as a distracter and Newtonian concept 5G as part of the multiple Newtonian concepts each addresses. The competing distracters differ, but twenty students chose I1 in each case. However, only thirteen students chose I1 on both questions. Of the Q9 responders, five chose the correct response on Q22 and two picked AF1 over I1. Of the Q22 responders, three chose the correct response on Q9 and four picked Ob/G1 over I1. Worse, I1 is a complete null on Q16, which is a pure 5G question, though it does have a different set of distracters. It would prove interesting to change Q9's G2 distracter (choice E) to a G5 distracter to see if there is an impact on I1 choice.

In what are most likely flukes, the following questions correlate even though they share neither Newtonian concepts nor distracters. If they appeared in other class data, they would be quite valuable ties between misconceptions: the beginning of a system. If Q4 is wrong, Q5 is wrong. If Q14 is wrong, Q9 is wrong. Because of multiple misconceptions per MCSR item, the following is a bit vague, but if you picked an item associated with I2 on Q6, you had a 91% chance of picking an item associated with I3 on Q5.

The initial knowledge state of the student must be determined by some method, and the FCI is PER's flagship. I did not use the revised edition, but still, some thoughts on the FCI might be worth considering. I doubt that G2 or AF3 should be included on the test, given not only their null responses here but also the very low responses documented in the original FCI paper. I also doubt that the FCI as it stands is a valid assessment for either G2 or AF3 anyway. AF3 is assessed once, as item 12E; AF3 is "no motion implies no force." Question 12 is the question with two correct answers, and option E's "none of these" is highly unlikely to be picked over 12A, which at least acknowledges the existence of a gravitational force. G2 is assessed three times, as items 5E, 9E, and 17D. G2 is "gravity intrinsic to mass". 5E is "none of the above, the ball falls back down to earth simply because that's its natural action." 9E is "gravity does not exert force on the puck it falls because of the intrinsic tendency of the object to fall to its natural place". And 17D is "falls because of the intrinsic tendency of all objects to fall toward the earth." Leaving aside arguments as to what gravity is, I see these three responses arguing against the existence of gravity, particularly in light of the competing items. For example, 5D is "a constant downward force of gravity only". So 5E is within a philosophic hair of rejecting the existence of gravity, no matter whether "intrinsic to mass" or not. I'm not surprised by the null results for G2 simply because, "intrinsic" or not, everybody believes in the existence of gravity, even if I, for one, would never claim to understand it. For all I know, gravity is intrinsic to mass. I've never seen gravity without mass. Any two or more masses give you gravity, and spherical cows aside, I know of no truly isolated single mass with which no other masses interact. An interesting thought experiment, but I'm not a theorist.

On a final note, I don't believe misconception [Ob] really needs six opportunities to be chosen, particularly as those in competition with the 3rd Law are wasted. Perhaps an AF1 choice for Q2 instead of the current Ob could be worked out. AF1 involves "active agents", and there is some uncertainty about how rigid the authors' definition is, but this might work: have one vehicle parked and the other one moving. Thus the moving vehicle could be the "active agent" and the parked one the non-active agent, setting up misconception AF1.



Chapter 18 : FCI Data on CCC Physics 11


18.1 : Presentation of Group PG's Raw FCI Data in Table M


Table M is the cleaned-up raw data for the Cabrillo Community College Physics 11 class, which is labeled PG. Physics 11 is a one-semester, algebra-based course for students with no prior physics experience; it is used as a Physics 4A preparatory class. Clean-up deals with: double answers allowed by instructor directions, blanks both advertent and inadvertent, unsolicited comments, solicited high school physics status, and time of completion.

Fifteen students provided thirty-eight double answers; this was out of thirty-three students and 957 response opportunities. Of these thirty-eight double answers, twelve had correct first answers and wrong second answers; eleven were the reverse, wrong answer first, correct answer second. Fifteen responses were wrong both times. S28 doubled up eight times, S24 five times, S33 and S13 four times each, five students twice each, and seven students doubled up on their answers once. Eighteen students did not provide double answers. Seven questions had no double answers, and eight questions had a single double response each. Twelve questions had two double responses, and two questions (Q4 and Q15) had three.

There were twelve blanks. All questions had zero (18) or one (10) blank, excepting Q28, which had two blanks. Only six students left blanks. S27 left the four questions on the last page blank with no indication that he had even seen them; further, he did have time to complete the test. S6 deliberately left three blanks but with no explanations (two of the problems involved graphs). The remaining four students left one blank each, except S32, who left two. Very few students both double answered and left blanks: S27 and S6 offered no doubled answers, while S28, S24, S33, and S13, conversely, provided no blanks.

There were eleven unsolicited comments by seven students. S18 and S20 provided three comments each; all other students provided one each. Q8 elicited three comments; Q27 elicited two. All other questions had zero or one comment. S18 had two comments to back up his two answers to Q8: he picked 8A "assuming no friction due to air", and he picked 8C "assuming the presence of atmosphere with friction." S20 answered 8C and then commented "* Forgot the description of friction?" And finally, S29 picked 8A and said "* IF FORCE THAT KEPT VELOCITY CONSTANT INITIALLY STILL APPLIES AFTER `KICK', THEN I THINK THAT THERE WOULD BE A SLIGHT INCREASE IN SPEED IMMEDIATELY AFTER 'KICK', THEN A RETURN TO `CONSTANT VELOCITY?'"

Seven students had a high school physics class (S2, S4, S9, S20, S21, S25, and S30); one student (S17) had a "conceptual physics college class five years ago". Group PG's average is 35%; the sub-group of the above eight students has an average of 30%. The sub-group of the seven students with only high school physics experience has an average of 33%. Student interviews might reveal why previous high school experience is of no statistical help at the community college level. It could be that unsuccessful high school physics students are trying to brush up on their skills by taking this class. Still, I would expect them to do better than S20 does with her Q8 comment: "forgot the description of friction". The high school group's poor showing may be the result of time passage; S21 informs us that in 1988 he got an "A", and it is now 2002. Or high school physics classes may not all be equal: S25, with eight correct responses, "took High School Physics @ Harbor High w/ Mr. Grove [and] Got a B final grade."
The average time of completion, using the subset of nine students who provided their completion times, is thirty minutes. On a final note, it is interesting to see S5's (twenty-two correct answers) response to the inquiry about previous high school physics experience: "I have had some physics experience thru personal interest."
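Bookkeeping of this sort is straightforward to automate, should the raw answer sheets ever be re-entered. A minimal sketch follows, assuming a hypothetical raw-answer grid in which each cell holds zero, one, or two marked letters; the actual clean-up was done by hand.

    from collections import Counter

    # Hypothetical raw grid: raw[student][question] is a tuple of marked
    # letters (empty tuple = blank, two letters = a double answer).
    raw = {
        "S28": {"Q4": ("A", "C"), "Q15": ("B",)},
        "S27": {"Q4": ("B",), "Q15": ()},
    }

    doubles_by_student = Counter()
    blanks_by_question = Counter()
    for student, row in raw.items():
        for question, marks in row.items():
            if len(marks) == 2:
                doubles_by_student[student] += 1
            elif len(marks) == 0:
                blanks_by_question[question] += 1

    print(doubles_by_student)  # Counter({'S28': 1})
    print(blanks_by_question)  # Counter({'Q15': 1})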













Table M



18.2 : Table N, A New Format


Table N is the hierarchical presentation of Table M data with MCSR items replaced by misconception codes from the modified Table II in Chapter 14. The right-hand side has both the Newtonian codes for correct answers [blank boxes] and the misconception codes available for student choice. Questions range from Q18 and Q5, which had one correct response each, to Q10, with twenty-eight correct responses. The average correct response for the questions is twelve out of thirty-three students. Students range from S5, with twenty-two correct responses, to S17, with three. The average correct response for the students is ten out of twenty-nine questions. In comparison, the median for the questions was only nine, not twelve, and the median for the students was nine, not ten. It is painfully obvious from Table N that only the easiest questions are understood, and then only by the better students. In looking at JZ's flukes (section 17.5), if Q4 is wrong, Q5 is wrong, if only because exactly one student got Q5 correct. If Q14 is wrong, Q9 is not guaranteed to be wrong (S11, S7, S27, and S26). Finally, if I2 is picked on Q6, there is only an 80% chance of I3 on Q5, which is close to the 75% chance any Q5 responder has of selecting I3. So despite one technical match, I am willing to forgo considering these ties in a misconception belief system.












Table N














18.3 : Analysis

Let's look at where the students are in this class. They do better on pure 1st Law [1] and pure Gravitational Force [5G] problems than on any other major division. The pure 3rd Law [3] and Superposition [4] problems are the worst. Q19 stands out with its 82% correct, but as the literature makes clear, Q19 is such a poorly designed problem that it was dropped from the revised FCI because the correct answer is selectable for non-Newtonian reasons. So I discount it here.

The pure 1st Law problems (Q26, Q8, Q4, Q27, and Q10) show strong belief in I2 and I3. They also show belief in I4 and I5. Impetus (I1-I5) is described in the FCI paper. Impetus is an inanimate motive power that keeps things moving. It must be supplied, rather like gasoline, and the moving object must store it and use it up the way a car does gasoline. Impetus might be more easily addressed via the concepts of energy and friction than by a head-on clash with Newton's 1st Law.

"Better" is a relative term, and the pure 5G problems (Q5, Q3, Q17, Q16, and Q1) are scattered across the board: Q5 is tied for hardest and Q1 is third easiest. Given an Impetus option, the students take it, particularly I3, impetus dissipation. After all, cars stop when they run out of gasoline; they don't coast forever. Telling students to ignore friction is meaningless if they can't define friction (S20), and all but meaningless to the rest, because students don't have frictionless real-life experiences to provide a context for these idealized questions. Even television has the Starship Enterprise stop when its engines stop; in order to be moving anywhere, its engines have to be on. To cap it all off, the engines need fuel: the perfect definition of impetus in space, which for our students is a frictionless environment, solar wind and microscopic meteorites notwithstanding.

Other than Impetus, G3 and, to a lesser extent, G4 and G5 are attractive distracters on the pure 5Gs, and G1 is picked often enough on the mixed 5G problems to warrant attention. G5 ties gravity to impetus. G4, gravity increases as objects fall, highlights the reverse problem of students being too theoretical and the problem too practical. Gravity does increase the closer an object gets to the earth's surface. The problem is both a matter of scale and a decision as to what is ignorable. Physicists are not mathematicians and often, if not always, ignore variations of one part in 10^13. Students and physicists don't ignore the same things: physicists ignore friction, which students don't; ditto for changes in gravity. Explicitly explaining to students the advantages of absurd simplifications, and the conditions under which physicists label quantities as zero, may go a long way toward having students' observable performances match our expectations. G3, heavier objects fall faster, is slightly tricky in that heavy objects are often more dense and thus tend to have smaller surface areas; this, given real-world air resistance, does allow them to hit the floor first in non-rigorous kitchen experiments. Worse, human eyeballs and reaction times in clicking timing devices are not ignorable factors in observing or timing short-distance drops. The concrete referent of an evacuated transparent tube, in which a feather and coin can be dropped together, is very helpful in changing this belief in G3. A discussion of sky diving may also help, if it is not too theoretical and takes into account terminal velocity. G1, air pressure-assisted gravity, mixes the idea of the number density of air molecules, a directional result of gravity [more molecules per unit volume closer to earth], with the idea of the pressure direction exerted by those molecules. The pressure of an air molecule is assumed to rely on its weight, much like the pressure a kid exerts on a trampoline: the more kids, the more pressure, in the same direction as their weights. That the pressure due to the random-walk, high-speed motion of the air molecules overwhelms their mere weight is not a normal awareness, as air inside a classroom seems quiescent, not in high-speed motion. A discussion of how odors cross a room may help shift the student's awareness of his air-molecule surroundings. The classic crushed-can experiment is also beneficial.

All 3rd Law questions (Q2, Q11, Q13, and Q14) are pure, and none is easy; even Q14 is in the upper half of the plane. AR1 and AR2 are overwhelmingly believed. Part of the problem is the equation F = ma. This formula implies that forces are singular, not paired, and that changes in "m" do (instead of merely can) change "F". F = ma strongly implies that a single mass in absolute isolation can exert a force. The form of Newton's Law of Gravitation, F = G*M1*M2/R^2, at least implies that the force is between two bodies and has the same magnitude for each. Multiplying M1*M2 is, after all, the same as multiplying M2*M1, even if M1 is not equal to M2. Thus, in the Law of Gravitation, the 3rd Law claim that paired forces between two objects are equal is more believable. With F = ma, force on, force by, and net force are confusing and overshadowed by the mathematical simplicity that if I change m, then F is going to change. The implicit act of keeping "a" the same is hidden by the label "a" itself not changing to, say, "a-prime" when the value of m is altered. Further, the literature highlights that words such as force, energy, and power, which physics uses with absolute mathematical precision, are viewed as synonyms in everyday speech. And it is quite true that the parties (truck and car) in a 3rd Law problem very often do not have the same kinetic energy, and thus by sloppy everyday English are seen as not having the same "force".

Newton's 3rd Law is merely a mantra if these underlying math and language issues are ignored. Then there is the whole "action/reaction" definition, which implies a time delay between two separate events: I hit you, you hit me back. Not: I hit you, and simultaneously your nose is hitting my fist, which is why boxers wear gloves. This belief in two separate, time-delayed events adds to student difficulties with the 3rd Law. After all, if the events are separate and time-delayed, what "memory" requires them to be equal? It is rather like believing that a pair of dice is going to roll a seven merely because it hasn't rolled a seven in the last five attempts and so is now "due". If we say the 3rd Law is "for every action there is an equal but opposite reaction", and if we say F = ma, then I am very surprised anybody believes the 3rd Law, although some may remember the existence of such an oddity.

Telling people that physics English is not common English; telling people that a single event has affected two objects simultaneously, and that there is an effect we call force which has the same magnitude but opposite direction on each object; and telling people that in the math of F = ma, changing "m" is not guaranteed to affect "F", because we are not guaranteed to be able to hold "a" constant (instead of "F", perhaps "a", or perhaps both, change when "m" does): all of this would help build a foundation on which Newton's 3rd Law is believable and useful. After all, in every math class this kid ever had, if not told to change "a", then it does not change. F = ma: let m = 2, what does F equal? Let m = 4, what is the answer? Trick question: what is the relationship between the two Fs? They're EQUAL! Yeah right, last time I checked, 2a did not equal 4a.

The Superposition Principle actually has a pure question in Q19; as spoken to previously, it is discounted. The remaining three Superposition questions are among the most difficult. Leaving aside Impetus, R2 and CI1 are the big misconception winners, and they are the same thing if "resistance" is viewed as a weaker force. In reality, CI1, largest force determines motion, and R2, motion when force overcomes resistance, are actually true! In a typical one-dimensional chalkboard example, the largest force does determine motion, as there are only two directions possible for the forces, and we never show multiple small forces overcoming a single large force. Further, certainly in our static friction examples, there is no motion until the force does overcome the resistance. So labeling these as misconceptions is a bit much; perhaps "misapplied" would be more palatable. I suspect other problems, notably the idea that you need a leftover force in order to move after balancing out any opposing forces: i.e., problems with Newton's 1st Law. Things move on their own with no outside or inside requirements; once moving, they just continue to move for NO reason. Effects need causes for most people. They have yet to forge the odd way of thinking needed for Q18: constant velocity = no change in velocity = no acceleration = zero acceleration = zero net force = in one direction, force (up) equals force (down). This is much more a matter of definitions than of superposition; students have to focus on the code words "constant velocity" to the exclusion of all else, and then run a string of translations. This leaves aside the recurring problems of acceleration/velocity discrimination and force/energy/momentum discrimination. After all, to go anywhere one clearly needs "left over" momentum, and energy, and velocity, just not force or acceleration.

The two remaining major divisions are the 2nd Law and Solid Contact Force. The 2nd Law is never pure; three times it is linked to Kinematics and once to the 1st Law. Solid Contact Force is pure twice, linked to 5G alone once, and linked to the combination [5G] and [4] once. CI2 and CI3 are the misconceptions of choice in competition with Newton's 2nd Law [2]. CI2, force compromise determines motion, is true if "compromise" is rigorously defined as vector addition. So, in my mind, this becomes almost a math issue instead of a physics content misconception. Besides, it is easier to think of Q24 in terms of momentum anyway, with one orthogonal component constant and the other steadily increasing from zero to significantly faster than the original component; but you still need math, not physics, to separate 24C (CI2) from 24E (correct). Belief in CI3, last force to act determines motion, is quite scary. It implies that what you are doing now will not affect the future if there is any intervening action. This is no more true in the real world than in the idealized one, so interviews with these students would be a valuable learning experience for me! By the way, I picked 24D, and I do not believe in I4. I do believe in scale. Given both the speed at which most "rockets" orbit the earth and their low thrust capabilities after discarding their boosters, trajectory 24D still makes sense to me, particularly given the FCI authors' comments about the scale effects of gravity (one part in 10^13) being ignorable. And yes, this is the trap of having real-world considerations intrude into idealized space, a point our students need highlighted as well. The point is that I chose an MCSR item for a reason other than the listed misconception of I4, "gradual / delayed impetus building". I buy into gradual and building; I don't buy into delayed or impetus. Scale considerations are acknowledged only in the FCI authors' comments on the G3 misconception; they are not acknowledged in the misconceptions themselves. Math and scaling causes for item selection are ignored, to the detriment of the test's utility.

Wrapping up, the two pure 5S, Solid Contact Force, problems do not share common misconception choices, which is unfortunate. Still, it looks as if AF1 is the idea to address, particularly as even the best students picked it. AF1, only active agents exert forces, means, according to the FCI authors, that "usually living things... by direct contact... cause motion". If a student selects 15A or 15B, then according to Table V he believes in AF1. The problem is that 15A and 15B do not involve a "living thing", they do not explicitly specify direct contact as 15C and 15D do, and causing motion and changing the direction of motion are not automatically the same thing. The above alone is a slim reed, but 15A, "energy of the ball is conserved", and 15B, "momentum of the ball is conserved", seem to have many other reasons to be picked than AF1, not least being that momentum of the system is conserved, which is usually the way these bouncing-ball problems are done. Further, given that the ball and floor don't break apart or stick to each other, conservation of energy is true. So in 15A, students may forget vectors, or forget that energy is not a vector. In 15B, word games (important ones) between "system" and "ball" may be the root cause of error. Or the implied time delay in 15C, "...stops ...and then...", may cause some to shy away from the Newtonian response. Without student interviews, I am leery of buying into AF1 as the root cause for 15A or 15B selection.



Chapter 19 : Conclusion to the Data Analysis


I was unable to find a few common, unifying mental models composed of a limited and patterned set of misconceptions. I thus take refuge in Redish's Paper 30, in which he advances the Individuality Principle: "each individual constructs his or her own mental ecology, different students have different mental models for physical phenomena...."

Analysis of individual students is very useful for teacher-student interactions. Were I teacher JM, knowing that S10 completely disagreed with expert opinion on the Independence Cluster and that S01 completely agreed with expert opinion on the Math Link Cluster would greatly enhance my ability to teach S10 physics and to use S01 as a peer instructor.

Written work and comments, or their absence, on these multiple choice tests provide insight into the state of the class as a whole. Were I teacher CF, the general lack of force diagram use by my third-semester, calculus-based physics students on the MB would be the catalyst for both explicit instruction on test-taking techniques and a pop quiz on force diagrams. The pop quiz would test my assumption that the students know how to construct force diagrams but were lulled by the multiple choice format into expending minimal effort. After all, I could be wrong: a majority of students may have wanted to use force diagrams and been unable, or they may not have recognized the applicability of force diagrams when lacking explicit instructions.

So, while I was unable to categorize students by a few well-defined, non-Newtonian mental models, the tests proved to be quite beneficial as pre-instruction reality checks.













Part III : The Thesis

Chapter 20 : Introduction to a Thesis


I have no desire to repeat my previous introductions. I needed to write Part III both to learn the material and to enjoy my learning. Reading Part I is as thorough an introduction to the material as possible. If you choose to dig into the meat of my effort, thank you. May you find some diamonds in your future mines (credit to D. Goodstein).



Chapter 21 : Assessment Tools


21.1 : Paper 01, Force Concept Inventory

Following this introductory paragraph is a review of the D. Hestenes et al. paper, "Force Concept Inventory", commonly referred to as the FCI. The FCI is the touchstone of PER literature; it is used in a large majority of papers which address Newtonian mechanics. Hake's average normalized gain, <g>, uses pre- and post-instruction FCI scores in its derivation. <g> is in turn used quite extensively to judge the effectiveness of various curricula. Specifically, it is the main criterion by which Interactive-engagement (I.E.) instruction is judged to be more effective than Traditional instruction. Thus a large portion of current PER results rely on the soundness of the FCI. There are criticisms of the FCI, notably that it is a qualitative, not a quantitative, test and that its use of normal everyday language, instead of physics formalism, leads to some ambiguity. There was a minor revision some years ago, with the most up-to-date version distinguishable by having thirty questions versus twenty-nine in the original. The change has had no noticeable effect, unlike the earlier shift from the Mechanics Diagnostic (MD) to the FCI. The FCI is unique in PER in that its distractors [wrong multiple choice answers] have explicit meaning: they are widely held, commonsense, non-Newtonian beliefs. In fact, analysis of wrong answers can tell an instructor a wealth of information about his students. It is this, rather than <g>, that makes the test most useful to an individual instructor.


D. Hestenes, M. Wells, and G. Swackhamer, "Force Concept Inventory," The Physics Teacher Vol. 30, 141-158 (1992)


The FCI was designed to improve on the Mechanics Diagnostic test (MD). The FCI has a systematic and complete profile of the various non-Newtonian misconceptions as they relate to force and Newton's Laws. The authors believe 80% on the FCI to be the "threshold" score for Newtonian thinkers. Furthermore, data suggests that an FCI score below 60% indicates the student's grasp of Newtonian concepts is insufficient for effective problem solving. Additionally, data suggests that student scores are unlikely to surpass their teacher's score. It is noted that "The FCI score should be viewed as an upper-bound on a student's Newtonian understanding" as interviews show that students sometimes choose Newtonian choices for non-Newtonian reasons.

The FCI assesses the student's overall grasp of the Newtonian concept of force. It decomposes this concept into six components: Kinematics, First Law, Second Law, Third Law, Superposition Principle, and Kinds of Force. The FCI can be used as a diagnostic tool, as an instruction evaluator, and as a placement exam for advanced college courses. The FCI is not an intelligence test; it is a probe of a belief system. It should not be used to place students in regular versus honors high school classes. Additionally, there is no correlation between scores and the socioeconomic ranking of the schools used during test validation.

The fundamental reason the FCI is not "just another physics test" is that the incorrect choices are correlated to specific misconceptions. These misconceptions are very important as they must be overcome and replaced by the non-commonsense Newtonian thinking before the student is asked to progress further in his physics education. Otherwise, the student is building his educational edifice on a foundation of sand. The FCI probes 28 distinct misconceptions in six major commonsense categories: Kinematics, Impetus, Active Force, Action/Reaction Pairs, Concatenation of Influences, and Other Influences on Motion. These 28 misconceptions are matched to corresponding incorrect answers in Table II of paper 1. A few examples of misconceptions are: velocity-acceleration undiscriminated, circular impetus, no motion implies no force, largest force determines motion, mass makes things stop, and heavier objects fall faster. These errors are commonsense misconceptions which are reasonable hypotheses grounded in everyday experience. Some of these errors were firmly held by Galileo and even Newton. That these misconceptions are, in fact, false is often NOT easy to prove.

There is a section on Overcoming Misconceptions which stresses the "unitary concept of force", and that the instructor should anticipate the 28 misconceptions. The instructor should discuss specific misconceptions, focus student attention on crucial issues, and bring discussions to a "satisfying closure". The authors argue that effective instruction requires more than dedication and subject knowledge; it requires technical knowledge about how students think and learn. There is the additional note that the misconceptions most difficult to overcome are the impetus concept of motion and the dominance principle.

The FCI was given to 1500 high school students and 500 university students during validation, primarily in the state of Arizona. This study indicates that attitude, not intelligence or mathematical competence, is the prime cause of greater achievement. In the case of Arizona high school students, their attitude is primarily attributed to family influence. This pre-Hake paper argues that post-test results are essentially independent of pretest scores and reiterates an earlier paper's findings that, for conventional instruction [Mechanics Diagnostic], post-test scores are instructor-independent. The paper goes on to discuss the Wells method. This method is computer-based, laboratory-oriented instruction with no lectures and is not compatible with the large lecture format used in college introductory physics classes. The Wells method did result in comparatively high post-instruction FCI results, but it had not yet been successfully used by other instructors at the publication date of paper 1. At Harvard, the average time to take the FCI was 23 minutes. At Arizona State University, 40 minutes was given for test taking. Most high schools gave a full 50-minute class period.


21.2 : Paper 02, Average Normalized Gain; <g>

Following this introductory paragraph is a review of R.R. Hake's paper, "Interactive-engagement versus traditional methods...." The main tool used to show Interactive-engagement's (I.E.) superiority over Traditional methods in physics instruction is the average normalized gain, sometimes referred to as the "Hake factor" and symbolized by <g>. Paper 2 is unusual in that its sample size is six thousand students, whereas most PER data are derived from a few hundred students. Those papers based on interview data sometimes focus on only ten or twenty students. <g> takes into account fluctuations in FCI pretest scores; these scores fluctuate significantly, contrary to what was asserted in the original FCI paper. Hake also uses a couple of additional tests to support his basic thesis that I.E. instruction is better than Traditional, but this paper's basic tool, <g>, is what future papers most often use. I should also point out that the fairly clear distinction in this paper between I.E. and Traditional becomes blurred in future work. For example, Interactive Lecture Demonstrations (ILD) "enhance" traditional lectures, and some I.E. methods, such as Studio Physics at Rensselaer, are no "better" than Traditional methods. Both of the above judgments are based on the use of <g>.


R. Hake, "Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses," Am. J. Phys. 66(1), 64-74 (1998)


Each student gets a "g", which is the FCI post-instruction test score minus the FCI pre-instruction test score, all divided by the quantity 100% minus the FCI pre-instruction test score, where all scores are in percentage form. Each class gets a <g>, which is the same math with each score replaced by its class average. Each type of instruction (I.E. and Traditional) gets a <<g>>NP, which is the average of <g> over N courses of type P. (A short computational sketch of g and <g> follows the definitions below.) The basic result is <<g>>14T = 0.23 ± 0.04 and <<g>>48IE = 0.48 ± 0.14. Thus, I.E. is a two-sigma winner over Traditional methods. The next big issue is how I.E. and Traditional are defined in this paper, but there are several smaller issues to be addressed first. The greater standard deviation for I.E. is attributed to "the variety of I.E. options and the varying effectiveness of implementation". There exists a wide range of course-average pretest scores, 18% through 71%. Data gathering was biased in favor of high-gaining classes; due to the self-reporting nature of data collection, low gains are usually neither published nor communicated. To increase statistical reliability, averages over courses are limited to those courses enrolling twenty or more students. The use of the average normalized gain, <g>, instead of average post-test scores or average gain, is defended in this paper. Interactive Engagement (I.E.) is defined as:


methods as those designed at least in part to promote conceptual understanding through interactive engagement of students in heads-on (always) and hands-on (usually) activities which yield immediate feedback through discussion with peers and instructors.


Traditional methods are defined as relying "primarily on passive-student lectures, recipe labs, and algorithmic-problem exams." For paper 2, 6548 students in 62 introductory physics courses were divided into forty-eight I.E. courses and fourteen Traditional courses. The paper mentions many I.E. programs by name, such as Overview Case Studies, Active Learning Problem Sets, Concept Tests, S.D.I. labs, and Minute Papers. While many of the I.E. courses have lower enrollments, four are highlighted as having enrollments of over 200 students. These four make use of collaborative peer instruction and employ undergraduate students to augment the instructional staff. At Indiana University, where Hake teaches, I.E. activities include team teaching, a "Physics Forum" open 5-8 hours per day, color coding of displacement, velocity, acceleration, and force vectors in all components of the course, and the use of grading acronyms.
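As a concrete restatement of the g and <g> definitions at the top of this review, here is a minimal computational sketch; the sample scores are hypothetical.

    def g(pre: float, post: float) -> float:
        """Single-student normalized gain: the fraction of the possible
        improvement (100 - pre) actually realized. Undefined at pre = 100."""
        return (post - pre) / (100.0 - pre)

    def class_g(pres, posts):
        """Course <g>: the same formula applied to class-average scores."""
        pre_avg = sum(pres) / len(pres)
        post_avg = sum(posts) / len(posts)
        return (post_avg - pre_avg) / (100.0 - pre_avg)

    pres, posts = [30.0, 45.0, 50.0], [60.0, 70.0, 80.0]
    print(round(class_g(pres, posts), 2))  # 0.49 for these sample scores

Note that g is undefined for a perfect pretest score (pre = 100), a degenerate case the class-average form avoids unless the class average itself is 100%.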

The key to <g> is the FCI, and Hake spends some time addressing FCI-specific issues. He contends that "most physicists would probably agree that a low score on the FCI / MD indicates a lack of understanding of the basic concepts of Mechanics". He points out the existence of pro and con arguments as to whether a high FCI score really is an indicator of attaining a unified force concept. He then notes that, yea or nay, the literature labels the FCI as the "best test currently (1998) available." He addresses five systematic errors that are "sometimes involved" in FCI testing. First, question ambiguities and isolated false positives. The revised FCI (1995) was used, though with "little impact" on <g>. Interview data suggest this first issue is rare, and these errors would affect both I.E. and Traditional classes equally; thus they have little effect on the differences in their gains. Second, teaching to the test and test-question leakage. Buried in his reference 48 is his belief that both the FCI and MB (Mechanics Baseline) will have "useful lives" only until 1999 and that new and better tests are both sorely needed and should be treated with the confidentiality of the MCAT. That being his belief for the future, in the paper itself he argues qualitatively that, as of 1998, this issue is not yet a problem. Third, courses spend varying amounts of time on mechanics; the assumption is that students would show higher Hake factors in courses devoting more time to mechanics. Hake does an institutional comparison of course time spent on mechanics and finds that the gain difference between I.E. and Traditional courses is "not very sensitive" to this possible systematic error. Fourth, a failure of students to do their best work on the pretest can artificially raise <g>, and a failure of students to do their best work on the post-test can artificially lower <g>. Hake asserts that students did in fact take both tests seriously, in part due to responses on instructor surveys and in part because <g> showed minimal fluctuations whether or not explicit grade motivations were offered by the instructor. Fifth, there are effects which produce short-term benefits independent of any intrinsic worth of the instructional method. Two of these are the Hawthorne effect, where the research test group benefits simply by receiving special attention, and the John Henry effect, where the control group benefits from a competitive desire to outperform the test group. The Hawthorne effect is discounted for the I.E. groups by taking a subset of long-standing courses and comparing its average to the total group's. The John Henry effect is ignored, as it would only further increase <<g>>48IE - <<g>>14T. While noting that there was no "reliable quantitative estimate of the influence of systematic errors", the general uniformity of results suggests that the two-sigma difference in average normalized gains between I.E. and Traditional courses is primarily a reflection of pedagogical effectiveness and/or implementation.

There are four additional points to raise briefly. First, Hake also uses the Mechanics Baseline (MB) test. The MB is a more quantitative test than the FCI and is usually given as a post-test. The MB and FCI scores show an extremely strong positive correlation, with MB scores about 15% below FCI averages. The MB has its detractors: for some, it does not sufficiently probe the advanced abilities required in "context rich", "experiment", "goal-less", or "out-of-lab" problems; for others, its problems are not sufficiently similar to those in Halliday-Resnick. Second, all of these tests are multiple choice, with random guessing yielding a score of 20%. However, it is quite possible for non-Newtonian thinkers to score below 20% on the FCI, due to "very powerful interview-generated distractors". Third, Hake concludes his paper with four sections that strongly advocate the use of I.E. methods. In summary, "The use of I.E. strategies can increase mechanics course effectiveness well beyond that obtained with traditional methods". Fourth and finally, in an epilogue, he speaks to his fear that, as history shows, the best of efforts may have little lasting impact. Hake's paper, by virtue of its 6000-student base and its introduction of <g> as a tool for instructional comparison, is one of the most referenced PER papers.


21.3 : Paper 03, Mechanics Diagnostic

Following this introductory paragraph is a review of the I. Halloun and D. Hestenes paper, "The initial knowledge state of college physics students". D. Hestenes is an author of both the FCI paper and the Mechanics Diagnostic, which is the FCI's forefather. The most interesting thing about paper 3 is its inclusion of a Math Diagnostic test in its Competence Index. This is the last paper (1985), of which I am aware, in which math is explicitly incorporated into a definition of physics competence. A large majority of PER is qualitative in nature, and a large portion of it has an implicit anti-math stance. As an example, P. Lindenfeld, in his "Format and content in introductory physics" letter published in Am. J. Phys. Jan. 2002, states, "My own opinion is that we are spending too much time trying to improve the mathematical facility of our students. I am not sure that to do so is our primary responsibility." In fact, it seems to not be our secondary responsibility either. Lindenfeld equates teaching physics through mathematical procedures to teaching poetry "by parsing, analysis, and rhyme structure, with little energy left for thought and substance." He is, in my opinion, a representative voice for a large segment of PER. In the MD paper, math ability counts for half of the Competence Index, an acknowledgment of the language of physics which I wish were emulated by others. Part of math's devaluation is that while the MD is included as an appendix to the published paper, the math diagnostic test is not. Worse, there is a belittling comment that the reader could make up his own. As an aside, there is a Math Project at UCLA that provides validated multiple choice math diagnostic tests for free and provides free grading; tests are available for all subdivisions of math, including algebra and calculus. The following short review is not comprehensive, since the MD has been superseded by the FCI. Nevertheless, certain aspects should be highlighted. The reader is cautioned that the MD and the MB are totally different tests.


I. Halloun and D. Hestenes, "The initial knowledge state of college physics students," Am. J. Phys. 53(11), 1043-1055 (1985)


Each student entering physics possesses a system of beliefs and intuitions about physical phenomena derived from his life. This is his common sense theory of the physical world. The student uses it to interpret what he sees and what he hears in his physics course. Conventional physics instruction fails to take this non-Newtonian common sense theory into account. Because common sense theories are both non-Newtonian and very stable [conventional instruction does little to change them], students systematically misinterpret material in introductory physics courses. This discrepancy between common sense theory and Newtonian physics is brought to light by "the Instrument".

The Instrument assesses the student's Basic Knowledge State with two tests. The first is a physics diagnostic test which assesses the student's qualitative conceptions of common physics. The second is a math diagnostic test to assess the student's mathematical skills. The Mechanics (physics) Diagnostic test assesses the student's qualitative conceptions of motion and its causes; it also identifies common misconceptions. The early versions required written answers, and the most common misconceptions were selected as the alternative answers in the final multiple choice version. Particular items were chosen to highlight the major differences between common sense and Newtonian concepts. The basic kinematical items are position, distance, motion, time, velocity, and acceleration. The basic dynamical items are force, resistance, vacuum, and gravity. The Instrument also includes a thirty-three question multiple choice mathematics diagnostic test which is not included in paper 3. The authors note that the math errors were not completely random and that error patterns indicated common misconceptions, but these were not analyzed. They note that the MD and math pretests assess independent components of a student's initial knowledge state and that high mathematical competence is not sufficient for high performance in physics [there is no comment as to whether it is a necessary condition].

The Instrument was given to 1500 Arizona State University students and eighty high school students, distributed among six professors and one teacher. The very low pretest scores for high school students led the authors to state that "physics instruction in high school should have a different emphasis than it has in college... the low scores indicate that [high school] students are prone to misinterpreting almost everything they see and learn in a physics class." Furthermore, given that the styles of the four lecturers in university physics vary widely within the formula of conventional instruction, and that all four produced essentially the same knowledge gains, the authors conclude that "basic knowledge gain under conventional instruction is essentially independent of the professor." The Instrument is recommended for use as a placement exam, to evaluate instruction, and as a diagnostic test. Its validity and reliability were addressed by a variety of means, including giving it to graduate students, interviewing undergraduate takers, and performing the Kuder-Richardson test of reliability. It was found that differences in academic background have a small effect on performance in introductory physics, with no effect found for gender, age, academic major, or high school mathematics differences.

The Instrument generates the Competency Index (C.I.), which is a practical measure of the student's knowledge state. The C.I. is the raw pretest scores of both diagnostic tests added together, for a maximum of sixty-nine. It was found that end-of-class grades correlated strongly with the pre-instruction C.I. scores. Students who had a C.I. < 30 had a 95% chance of getting a C or worse. The authors argue that such C.I. recipients should be "considered candidates for a pre physics preparatory course." The overall small gains and low pretest scores mean that many students continually misunderstand the material presented. In particular, a low score on the MD does not mean simply that basic concepts of Newtonian mechanics are missing; it means that alternative misconceptions about mechanics are firmly in place. "To ignore the initial common sense knowledge in physics instruction is akin to ignoring initial conditions in integrating differential equations."
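
Since the C.I. arithmetic is easy to misread in prose, here is a minimal sketch in Python; the function names and the example scores are mine, not the authors'.

    # A sketch of the Competency Index arithmetic (names and example are mine).
    def competency_index(md_score, math_score):
        # The raw pretest scores of the two diagnostic tests, added together
        # (maximum of sixty-nine).
        return md_score + math_score

    def prep_course_candidate(ci):
        # Per the authors: C.I. < 30 implies a 95% chance of a C or worse.
        return ci < 30

    # Example: 18 on the MD plus 10 on the math diagnostic gives C.I. = 28,
    # flagging the student as a candidate for a pre-physics preparatory course.
    assert prep_course_candidate(competency_index(18, 10))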


21.4 : Paper 04, Mechanics Baseline

Following this introductory paragraph is a review of D. Hestenes and M. Wells, "A Mechanics Baseline Test." This comparatively short paper presents the FCI's stepbrother, the Mechanics Baseline Test (MB). The MB appears more quantitative than the FCI and uses more physics formalism. It is used almost exclusively as a post-test. In the literature, it is often used to show that a focus on concepts and less in-class problem solving are not detrimental to student results on a quantitative test. Unlike its brother, its multiple choice distractors are not value laden. As already mentioned, please note that the MB is not the same thing as the MD.


D. Hestenes and M. Wells, "Mechanics Baseline Test," The Physics Teacher Vol. 30, 159-166 (1992)


The Mechanics Baseline Test is a universal assessor of student understanding of basic mechanics concepts. There exists an extensive baseline of post-instruction scores which allows for evaluating and comparing the effectiveness of instruction. The best use of the test is for post-instruction evaluation. The MB is a step above the FCI in assessing mechanics understanding. The FCI was designed for students without formal training in mechanics, to elicit their preconceptions. The MB emphasizes concepts which require that formal training. The MB and the FCI are complementary and together provide a "fairly complete profile" of Newtonian understanding. The main intent of the MB is to assess qualitative understanding, although it looks like a conventional quantitative test. Unlike the FCI's, the MB distractors are not "common sense alternatives," although they do include "typical student mistakes." Problems that can be solved by plugging into a formula were excluded.

The MB is not easy. Students at all levels get low scores. This is despite fewer than a third of the questions requiring algebraic manipulation or more than one step of reasoning, and despite the exclusion of advanced topics such as angular momentum. Tables highlight the Newtonian Concepts on the MB, provide the correct responses, and provide the percentage of correct answers by various groups (mostly Arizona high schools). The MB addresses kinematics thoroughly with twelve questions. The MB addresses the conservation laws for energy and momentum in both the work-energy and the impulse-momentum forms. It also has subsets of questions addressing Calculation and Diagrams. Widespread deficiencies in the qualitative understanding of acceleration were found among both students and university introductory physics instructors!

The MB is graphically referenced to post-instruction FCI data with the authors asserting that a good score on the FCI is a necessary but not sufficient condition for a good score on the MB. Sixty percent on the FCI is a "conceptual threshold" needed for effective problem solving on the MB, and eighty percent on the FCI is a "mastery threshold" required to achieve more than eighty percent on the MB. This paper ends with a "Note added in proof" by Prof. Eric Mazur of Harvard University, who outlines a procedure of interacting with his lecture class that raised his FCI / MB class averages from 77% / 60% to 85% / 72%.

While the end of Paper 4 is a good segue into ConcepTests and Peer Instruction, we aren't yet finished with the assessment tools by which most of the literature judges its success. The upcoming review covers our last mechanics test. It will be followed by some tests focusing on other aspects of physics, including E&M and Thermodynamics. I chose to place assessment tools at the beginning of Part III because what tool you use to determine the success or failure of your procedure is almost as critical as how you define success in the first place.


21.5 : Paper 05, Force and Motion Conceptual Evaluation

Following this introductory paragraph is a review of R. Thornton and D. Sokoloff's paper "Assessing student learning... the Force and Motion Conceptual Evaluation..." (FMCE). Paper 5, published in 1998, seems to be the successor of the 1992 FCI. Its use would address some, if by no means all, of Hake's reference 48 issues. Notably, if the useful life of the FCI and MB ended in 1999, then despite its nonsecure status the FMCE should be their replacement. This does not seem to be what's happened. The FCI and <g> are currently (2002) in use, and I have rarely seen usage of the FMCE. In particular, I have found no paper correlating FMCE scores to FCI scores, which would allow FCI-based papers to remain useful after such a transition. I suspect it's a feet and meters kind of thing; even if a conversion scale existed, an education community comfortable in its traditional ways feels no need to transition to the FMCE, despite the prevalent use of FCI test questions in other forums. For example, FCI question 10 was used as a concept test in a Cabrillo Community College physics class last semester. Further, an instructor at Cabrillo, who gave the FCI as a pretest this semester, promptly handed it back after grading so students could study it. This is what he does with all of his tests. So, while I do not believe there is a concerted effort to bias FCI data, the unconcerted result is concerning. The FCI is a ten year old public test with really neat questions that instructors find useful and interesting. The FMCE, on the other hand, is basically an unknown. It lacks explicit meaning for its distractors, and it lacks the equivalent of Hake's 6000 student survey which brought widespread respect to the FCI. In an effort to increase awareness, I present the following review.


"R. Thornton and D. Sokoloff, "Assessing student learning of Newton's laws: The Force and Motion Conceptual Evaluation and the Evaluation of Active Learning Laboratory and Lecture Curricula," Am. J. Phys. 66(4), 338-352 (1998)


Paper 5 presents the FMCE. It also uses the results of a subset of four FMCE dynamical questions to justify an instructional method that combines "Tools for Scientific Thinking (TST) Motion and Force," "Real Time Physics (RTP) Mechanics," and "Interactive Lecture Demonstrations (ILD)." The first two (TST and RTP) are microcomputer-based laboratory (MBL) curricula. The total gain as measured by the FMCE for using all three methods was an incredible 75%! Unfortunately, combining these two goals results in a minimally useful FMCE. The authors imply that wrong answers can be used to evaluate student views, but they offer no explicit methodology to accomplish this end. The authors "are able to identify statistically most student views from the patterns of answers and because there are very few random answers." Unfortunately, they don't provide us with the statistical patterns, nor with a listing of the different student views. What they do provide is a "focus" on dynamics concepts, as probed by four sets of questions: the Force Sled, the Cart on a Ramp, the Coin Toss and the Force Graph. These sets are a substantial chunk of the 43 question test.

The Force Sled (questions 1-7) and the Force Graph (questions 14-21) were given as a pre- and post-test to 240 students at the University of Oregon. The total improvement due to traditional instruction averaged seven percent. The Force Sled (FS) and Force Graph (FG) ask about similar motions in very different ways. The FS uses natural language as much as possible and explicitly describes the force. The FG uses graphical representations in an explicit coordinate system but does not explicitly describe the force. If students do much better answering the FG than the FS, it is possible that their English language skills are weak. Conversely, if students answer Question 15 (FG) incorrectly, it is likely that they are unable to read a graph. Question 5 (FS) is designed to identify students who are just beginning to consider Newton's First Law. Question 6 (FS) should be interpreted cautiously, as 40% of physics faculty chose the incorrect answer, F. Questions 1-4 and 7 are used to make a composite average labeled "natural language evaluation." Questions 14 and 16-21 make a composite average labeled "graphical evaluation," as sketched below.
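
The composites are simple averages over fixed question subsets; the following sketch (the data layout and names are my own, the groupings are from the paper as just described) shows the computation.

    # Composite averages over the FMCE question subsets described above.
    # `answers` maps question number to 1 (correct) or 0 (incorrect).
    NATURAL_LANGUAGE_QS = (1, 2, 3, 4, 7)         # Force Sled subset
    GRAPHICAL_QS = (14, 16, 17, 18, 19, 20, 21)   # Force Graph subset

    def composite(answers, questions):
        # Fraction correct over the given subset of questions.
        return sum(answers[q] for q in questions) / len(questions)

A large gap between a student's two composites would, per the authors' reasoning above, point toward a language or graph-reading difficulty rather than a physics one.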

There are no more specifics. On a general note, the authors assert that there is a learning hierarchy formed first by kinematics and then by dynamics. An improved student understanding of kinematics also improves student learning of dynamics. The validity of the FMCE is addressed by observing the results of those students labeled Newtonian Thinkers (seven out of eight questions right on the Force Graph) on other tests, and their written explanations to the Cart on a Ramp, question 9. It is noted that guessing correctly is very difficult because there are up to nine choices, with the choices derived from an open-ended questioning process during student interviews. Finally, paper 5 is one of the few papers to mention retention of knowledge. Retention, after six weeks in which there was no additional dynamics instruction, was superb, with an increase in students answering in a Newtonian way. This increase (approx. 6%) is attributed to assimilation of concepts.

Well that's it for mechanics tests. While a great deal of PER focuses on Introductory Physics, Mechanics, Newton's Laws, Force, and acceleration, recently there has been an expansion into a wider range of physics domains, and thus a need for assessment tools in those domains. The next couple of reviews introduce assessment tools for Electromagnetism and Thermodynamics.


21.6 : Paper 06, Conceptual Survey in Electricity and Magnetism

The following is a review of D. Maloney et al.'s paper in which the Conceptual Survey in Electricity and Magnetism (CSEM) is introduced and used. This recent paper (2001) is an attempt to provide PER with a tool for E&M analogous to the FCI for Mechanics - a standard which allows various research efforts and different programs to be compared and contrasted quantitatively.


D. Maloney, T. O'Kuma, C. Hieggelke, A. Van Heuvelen, "Surveying students' conceptual knowledge of electricity and magnetism," PER Am. J. Phys. Suppl. 69(7), S12-S23 (2001)


The FCI has raised the consciousness of many physics teachers about the effectiveness of traditional education in educating students about basic kinematics and Newton's three laws. To assess students' knowledge of electricity and magnetism, the authors have developed the CSEM. The CSEM is a broad survey instrument for use in general physics courses. It can be used to assess a student's initial knowledge state as well as the effect of various curricula and instructional methods on improving that state. The CSEM has a number of significant differences in comparison to the FCI; not least is its reliance on other domains, such as force, motion and energy. The CSEM deliberately excludes DC circuits because the test needed to be shortened, and there are other instruments for assessing student understanding of DC circuits. The iterative four year development process started with questions gathered in a workshop and worked its way through an open-ended version used to gather valid distractors. What began as two separate tests, one for electricity and one for magnetism, was combined into a single instrument. The final stages of revision were based on feedback from instructors who evaluated and/or administered the earlier versions.

The quality of individual test items and the quality of the overall test are addressed; two measures of item quality are difficulty and discrimination. Difficulty is the percentage of students who get the item correct; for the CSEM, the difficulty ranges from 0.1 to 0.8. There are only seven items with a difficulty over 0.6, which is less than ideal. Discrimination assesses how well a test question differentiates between the students in the top 27% of overall scorers and the students in the bottom 27%. A value called the "Item discriminator" (Id) is created: with NU the number of top-group students answering the item correctly, NL the corresponding number in the bottom group, and n the number of students in each group, Id = (NU - NL) / n. CSEM item discriminators range from 0.10 up to 0.55. All but four questions out of thirty-two have Ids above the traditional lower limit of acceptability, 0.20. Difficulty and discrimination are correlated, with the full range of discrimination only available to problems with a difficulty of 0.5. Two standard measures of overall quality are validity and reliability. Validity was assessed by forty-two community college instructors using a 5 point scale. A table provides the mean and standard deviation for each question's validity. Algebra-based courses and Calculus-based courses are presented separately. All means are above 4.0. Reliability is calculated by the KR20 test; the authors cite a reference and give a brief description. Essentially, the actual test is broken into two tests, each consisting of half the items; the correlation between performance on these two subtests is calculated. This is repeated for all possible half-item subsets. The KR20 post-test estimates for the CSEM are around 0.75. Reliabilities of 0.9 to 1.0 are rare. Reliabilities of 0.8 to 0.9 indicate that a test can be used for both individual and group evaluation. Reliabilities of 0.7 to 0.8 are common for well-made cognitive tests. Reliabilities of 0.6 to 0.7 indicate weak cognitive tests but are acceptable for personality tests. Reliabilities of 0.5 to 0.6 are common for well-made classroom tests.
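
To pin down the three statistics, here is a minimal sketch, assuming student results are stored as 0/1 answer vectors; the implementation and names are mine, and kr20 uses the standard closed-form expression rather than literally enumerating the half-item subsets described above.

    # Item and test statistics for a multiple choice test; `scores` is a list
    # of per-student answer vectors, each a list of 0/1 (incorrect/correct).
    def difficulty(scores, item):
        # Fraction of all students answering this item correctly.
        return sum(s[item] for s in scores) / len(scores)

    def discrimination(scores, item):
        # Id: fraction correct in the top 27% of total scorers minus the
        # fraction correct in the bottom 27%.
        ranked = sorted(scores, key=sum)
        n = max(1, int(0.27 * len(scores)))
        bottom, top = ranked[:n], ranked[-n:]
        return (sum(s[item] for s in top) - sum(s[item] for s in bottom)) / n

    def kr20(scores):
        # Kuder-Richardson 20 reliability in its usual closed form.
        k = len(scores[0])
        totals = [sum(s) for s in scores]
        mean = sum(totals) / len(totals)
        var = sum((t - mean) ** 2 for t in totals) / len(totals)
        pq = sum(difficulty(scores, j) * (1 - difficulty(scores, j))
                 for j in range(k))
        return (k / (k - 1)) * (1 - pq / var)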

Further validation was done by running factor analyses (principal component) on the CSEM. Factor analysis looks for significantly correlated groups of test questions. Eleven factors were found, with the largest at 16%. The improvement of factor structure would require additional questions, which would increase testing time, a sensitive issue with classroom instructors. The 16% is very small for a first factor, and while the eleven factors are mathematically identifiable, they are not considered meaningful by the authors. The CSEM is a valid, reliable instrument that probes both the limited existing set of student alternative concepts and aspects of E&M formalism.

Having presented the instrument, the authors use it. The result for algebra-based pre-instruction testing is 25% ± 8%; post-instruction testing is 44% ± 13%. The result for calculus-based pretesting is 31% ± 10%; post is 47% ± 16%. The sample size total is approximately 1500 students. There is a noticeable disparity between the results on the electricity questions (1-20) and on the magnetism questions (21-32). Scores were lower on magnetism questions by roughly ten percent depending on the bin. Question results are presented with student percentages for each of the multiple choice letter responses, as broken out by pre/post test and algebra/calculus based.

The CSEM has eleven conceptual areas; the paper addresses seven of them. First, Conductors and Insulators: a substantial number of students cannot distinguish between conductors and insulators. Second, Coulomb's Law: students seem to believe larger charge magnitudes exert larger forces than smaller charge magnitudes. Third, Force and Field Superposition: students confuse magnetic field effects with electrical field effects. Fourth, Force, Field, Work and Electric Potential: students still associate constant velocity with a constant force and cannot deduce the direction of the electric field from a change in a potential. Fifth, Magnetic Force: in magnetic force problems, getting students to check whether a charge's velocity has a component perpendicular to the magnetic field direction is very difficult. Sixth, Faraday's Law: students do not see a collapsing loop as changing the magnetic flux, or certain rotating loops as leaving that flux unchanged. Seventh, Newton's Third Law: over 50% of students fail to believe Newton's Third Law extends to electromagnetic situations.

The CSEM provides an estimate of student learning for some important ideas in electricity and magnetism; it can provide guidance for research. Research needs to be done in determining the nature of students' alternative ideas about topics in electricity and magnetism. Pretest responses show recurrent patterns, some of which are highly resistant to change by traditional instruction. Interwoven in this unclarified area is language use and students' interpretations of language use. Fractional gains [<g>s computed from CSEM scores] range from 15% up to 60%, clustering around 30%. Additional research on instructor strategies is needed to determine the impact of particular techniques on student performance.

As a concluding paragraph of my own, I would like to highlight an issue: Validation. This paper presented a far more in-depth presentation of assessment instrument validation than any other paper I have read in PER. While I have never performed a KR20 test, I can but contrast this with the validation presented by the authors of the FCI paper (paper 1):


Formal procedures to establish the validity and reliability of the FCI are UNNECESSARY [my highlight] because of its similarity to the Mechanics Diagnostic for which considerable care was taken to validate.


This casual attitude toward validity is extensive. For example, the validity of the FMCE (paper 5) is addressed by observing the results of those students labeled "Newtonian Thinkers" (seven out of eight questions right on the Force Graph) on "other tests" and on their "written explanations to the Cart on Ramp question 9." This nonexistent or casual validation leaves room for both legitimate and illegitimate skepticism about whether these tests actually are what they profess to be: the bedrock on which instructional change is built. Reducing skepticism is one important factor in achieving widespread use of PER results, and therefore validation should be done thoroughly and seriously.

The final full physics-content test presented in this chapter is the Thermal Concept Evaluation (TCE). I say full test because many of the research papers include a question or five, written and used by the authors for their paper-specific purposes. A few of these questions are designed to be added to the FCI, but many of them are stand-alone little quizzes. I say physics-content because PER also extends beyond content into epistemology. Epistemology has its own set of tests, one of which, the Maryland Physics Expectations Survey (MPEX), I will review as paper 32.


21.7 : Paper 07, Thermal Concept Evaluation

S. Yeo and M. Zadnik's upcoming paper was published in "The Physics Teacher", a very enjoyable magazine. Most of this literature review is of papers from that source and from the American Journal of Physics (Am. J. Phys.). The Am. J. Phys. is a far more scholarly journal which, in addition to a monthly issue, puts out a yearly PER supplement. There are a large number of additional PER sources. For example, the back of the PER Am. J. Phys. Suppl. 68(7) lists 39 papers in thirteen publications for 1999 alone.


S. Yeo and M. Zadnik, "Introductory Thermal Concept Evaluation: Assessing Students' Understanding," The Physics Teacher Vol. 39, 496-504 (2001)


The authors present an instrument that is almost the FCI of Thermal Physics. I say almost because while naive alternative concepts are used as distractors [wrong answers for Multiple Choice Single Response (MCSR) test questions] and matched to specific questions in a table, the distractors are not matched to specific answer choices within the question. Thus, the reader is left to match misconception to answer, a step I would have preferred the authors to have done. Further, the correct answers are not noted, and I was uncertain of the correct responses to three questions. The answers I chose are in agreement with those chosen by my mentor, who also took this test. Even with those caveats, the comparison to the FCI holds up well. The specific alternative conceptions (Heat and Temperature are the same thing, Skin or Touch can determine temperature, Heat rises, the bubbles in boiling water contain "air", "oxygen" or "nothing") will surprise few. However, the sheer number of thirty-five Alternative Thermal Physics Conceptions will provide some insight for everybody. Paper 7 also addresses Instrument Development, Testing the Instrument, and Test Validity. The TCE is provided as Appendix I to paper 7. It has been made available for use as a pre/post-test, for assessing alternative concepts at any point of instruction, and for planning instruction or remediation. Paper 7 is very similar to the FCI paper in its non-content specific parts, and the content specific alternative beliefs are presented, not developed. Yeo and Zadnik have provided a valuable tool for the Thermodynamics professor.


21.8 : Paper 08, Concentration Analysis

Tests can provide, and are most often used for, an overall score. However, individual questions or subsets of questions can provide useful information to instructors and students. A method to extract information internal to a test is presented by L. Bao and E. Redish in their paper "Concentration analysis: A quantitative assessment of student states." They devise an algorithm applicable to "any" MCSR test, and they demonstrate their technique on the FCI. The basic idea is one I find valuable, and in the Cabrillo Community College data analysis section, I have applied their concentration analysis to MB data. My basic problem with their paper lies in their application of this method to the FCI. As I understand their math, each MCSR item must be independent and unique; this is not true for the FCI.

As an example: Bao and Redish label FCI question 15 as an "LL", which means a "near random situation." This is false, because distractor "a" and distractor "b" represent the same misconception. This single misconception, according to the FCI paper (paper 1), is labeled "AR1: greater mass implies greater force." Thus, what at first glance looks like the distribution a = 23, b = 38, c = 32, d = 0, e = 6 [data from AVH table V, paper 1] is in reality AR1 = 61, 5S = 32, Ob = 0, ? = 6. 5S represents the correct Newtonian response; the other symbols represent various specific misconceptions. What were three seemingly different and nearly equally distributed answers (a, b, c) are in fact two different, non-equally distributed answers (AR1 & 5S). Thus, question 15 is really an "MM": "Two popular models one of which is the correct answer."

Having dispatched the idea that FCI choices are independent, I will now offer evidence that they are not always unique. FCI question #11 is labeled by Bao and Redish as "MM" which, as has already been stated, means "Two popular models one of which is the correct answer." The raw data, using AVH from paper 1 because Bao & Redish don't provide raw data, is a = 2, b = 4, c = 4, d = 21, e = 68. Unfortunately, d is not unique; it can be chosen for two separate reasons: AR1 and/or AR2. AR1 is greater mass implies greater force; AR2 is most active agent produces greatest force. Without delving into the definition of "active agent," suffice it to say that it does not mean mass. Thus we do not have two models, we have three. Two of the three are entangled in distractor "d" and thus inseparable by this question. Having argued against the application of concentration analysis to the FCI, I did use this analysis on Cabrillo MB data, in part because I was interested in the results and in part to make sure I understood what the authors were proposing. Paper 8's quantitative analysis and graphical displays imply a greater certainty than the reader should accept, but it does provide avenues for further exploration and allows for manipulation of large amounts of data. The fundamental fact is that each choice on a MCSR test is mathematically assumed to be independent and unique. To the extent that this assumption is false, the end results are suspect.
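
The re-binning I am arguing for is mechanical; the sketch below (my own code, using the question 15 numbers above) collapses raw answer counts into misconception counts.

    # Collapse raw answer counts into misconception counts using a
    # choice-to-misconception mapping; the question 15 mapping follows the
    # discussion above ("?" is my placeholder for the unclassified choice e).
    from collections import Counter

    Q15_MAP = {"a": "AR1", "b": "AR1", "c": "5S", "d": "Ob", "e": "?"}
    Q15_COUNTS = {"a": 23, "b": 38, "c": 32, "d": 0, "e": 6}  # AVH table V data

    def rebin(counts, mapping):
        binned = Counter()
        for choice, n in counts.items():
            binned[mapping[choice]] += n
        return binned

    print(rebin(Q15_COUNTS, Q15_MAP))  # AR1: 61, 5S: 32, Ob: 0, ?: 6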


L. Bao and E. Redish, "Concentration analysis: A quantitative assessment of student states," PER Am. J. Phys. Suppl. 69(7) S45-S53 (2001)


Qualitative research based on interviews and analysis of open-ended problem solving has documented different clusters of semi-consistent reasoning that students use in responding to physics problems. This knowledge has been used to create attractive distractors for multiple choice examinations such as the FCI. In examining large populations, student choice among wrong distractors contains information as valuable as the grosser distinction between correct and incorrect, which has been the focus of most research. The authors have developed an algorithm that extracts and displays how students produce incorrect answers. Their method analyzes the concentration/diversity of student responses to particular multiple-choice questions.

In constructing a model of student knowledge, the authors appeal to neuroscience, cognitive science, and educational research. The agreed upon core research elements are: (1) memory is associative; (2) cognitive responses are productive; and (3) cognitive responses are context dependent (inclusive of the student's state of mind). This alone is not enough of a base, so the authors also focus on several structures proposed by researchers: (a) patterns of associations (neural nets), (b) primitives/facets, (c) schema, (d) mental models, and (e) physical models. The authors define these terms. They desire to determine the effectiveness of a particular multiple choice question in triggering the small number of research-identified common naive schema or mental models. If a multiple-choice question is designed with these naive mental models as distractors, then the distribution of student responses yields information on the student's state. The student who has a strong naive belief will pick multiple wrong answers that are based on that unifying mental model. Students who simply lack knowledge will choose distractors randomly.

The authors create a concentration factor (C). This factor measures the distribution of responses to a question on a scale of 0 to 1. C = 0 is an even distribution with all choices (A, B, C, D, E) selected by the same number of students. C = 1 is the other extreme, with all students selecting the same choice. Thus each test question gets a C value from 0 to 1 that has no relationship to correctness or incorrectness. This C value merely reflects the distribution of choices. Paper 8 explains the math formula used to create C and verifies C's range. The concentration factor (C) is used to study several different aspects of student data. One study labels a question with two letters. The first (L = low, M = middle, H = high) characterizes the students' scores; the second is derived by binning the concentration factor. For example, an LH question implies an incorrect prevalent model, i.e. a low score and a high concentration. A table provides the implications of the two letter labels. Not all permutations are possible, as score and concentration are not independent. Pattern shifts from pre- to post-instruction reflect the impact of instruction.

Rather than gross binning (low, medium, high), the score (S) and the concentration factor (C) of a question can be displayed as a point (S,C) on an S-C plot. S-C plots are shown; they have boundaries due to the constraint between S and C. The score (S) is the number of students who chose the correct choice out of all possible students (N); on the plots, S is normalized. A new variable G is defined as the new concentration factor of incorrect responses only. It shows more detail and is determined by the removal of the absolute offset created by the score. G is called the concentration deviation. C and G highlight different aspects of the data.
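
For concreteness, here is a sketch of both factors in Python. The formula for C is the one paper 8 uses; the G function is simply my reading of the verbal definition above (the same calculation restricted to the incorrect choices), not a formula quoted from the paper.

    # Concentration factor C and concentration deviation G.
    from math import sqrt

    def concentration(counts):
        # C = (sqrt(m)/(sqrt(m)-1)) * (sqrt(sum of n_i^2)/N - 1/sqrt(m)),
        # where n_i students picked choice i, N = total students, m = choices.
        # C = 0 for an even spread; C = 1 when everyone picks the same choice.
        m, N = len(counts), sum(counts)
        return (sqrt(m) / (sqrt(m) - 1)) * (sqrt(sum(n * n for n in counts)) / N - 1 / sqrt(m))

    def concentration_deviation(counts, correct):
        # G, per the verbal definition above: the concentration factor
        # recomputed over the incorrect responses only (my reading).
        return concentration([n for i, n in enumerate(counts) if i != correct])

    # FCI question 15 data from my paper 8 discussion above; correct choice is c.
    print(concentration([23, 38, 32, 0, 6]))                # approx 0.20
    print(concentration_deviation([23, 38, 32, 0, 6], 2))   # approx 0.34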

An example is provided by analyzing the FCI pre- and post-test results from fourteen introductory calculus-based physics classes at the University of Maryland. The reference provides the scores (S), the concentration factor (C) and the two letter label (HH...LL) for each question on the FCI, based on pretest data from 778 students. LH and LM questions are analyzed, with all of them addressing either the naive model that motion requires an unbalanced force, or the naive model that the larger or more active agent will produce the larger force. Two LL questions are addressed. It is argued that no naive model accounts for the low score, low concentration. Both questions deal with detailed physical processes that require the integration of various pieces of physics knowledge [see my introductory paragraph]. S-C plots are offered for Traditional versus Tutorial instruction. The average shift is larger for the tutorial classes, increasing in both S and C, showing more students holding the single correct model. Traditional classes shift, but only into a two model region where a significant number of students still hold incorrect beliefs. S-C plots are used on the FCI subset of Low Scoring Questions (2, 5, 9, 13, 15, 18, 22, 24, 28) and on the Force-Motion Questions (5, 9, 18, 22, 28), showing the same result. S-G plots are also presented and analyzed. In comparing S-G plots of Traditionally instructed classes to those of Tutorially instructed classes, Questions 9 and 22 stand out. For Question 9, Traditional instruction changed student distractor choice from b to c. This implies that while students learn to recognize the "normal force," they continue to believe that a force is needed in the direction of motion. There is no comment on Question 22.

Concentration factoring can facilitate test development, instruction, and assessment. It can help confirm the presence and prevalence of erroneous models detected through research. It allows the detection of questions that do not have a relevant distractor, and can lead to improving multiple-choice tests. Information on how the majority of students get a question wrong cannot be analyzed using test scores alone; concentration factoring provides important clues for improving instruction.

Excluding the MPEX (paper 32), that ends our review of assessment tools. PER uses these tools primarily to establish the ascendancy of Interactive Engagement (I.E.) methods/curricula over those labeled Traditional. Interactive Engagement can be roughly divided into how physics-content should be taught and what physics-content should be taught. How outweighs what in the literature and is our next focus. Three points are in order before we advance. First, how and what are not mutually exclusive; fragments of one will be intermixed in a paper dominated by the other. Second, the work of L.C. McDermott's University of Washington Physics Education Group will be presented later in this paper, even though their final product, the tutorials, is very focused on how to teach physics-content. Third, epistemological issues, which are yet to be focused on, play an extensive role in a few of these papers. PER is more of a web than a logic tree. Thus, it's not always possible to present the material in an ordered linear format.



CHAPTER 22 : Interactive Engagement Methods that Retain the Lecture


The mainstay of traditional physics teaching methodology is, and certainly has been, the lecture. Due in large part to the miserable <g>s of lecture classes, the lecture is and will continue to be viewed as ineffective and probably detrimental to the student's learning of physics. Having said that, most physics students are still subjected to the lecture, and are so subjected for more reasons than mere bureaucratic inertia. Thus we shall lead off our I.E. methods section with some papers that seek to retain the lecture but improve its <g>s (effectiveness). The upcoming paper is on an I.E. method that is often referred to by other papers, wherein it enjoys a good reputation.


22.1 : Paper 09, Interactive Lecture Demonstrations


D. Sokoloff and R. Thornton, "Using Interactive Lecture Demonstrations to Create an Active Learning Environment," The Physics Teacher Vol. 35, 340-347 (1997)


Efforts to improve physics education while maintaining existing structures have resulted in Tools for Scientific Thinking Microcomputer-Based Interactive Lecture Demonstrations (ILDs). ILDs are an extension of the authors' microcomputer-based laboratory (MBL) curricula for introductory physics, Laboratory Tools for Scientific Thinking and Real Time Physics. An ILD is an eight step procedure that engages students actively via individual predictions, small-group discussion with nearest neighbors, and, after the MBL measured demonstration, the completion of a results sheet. For the instructor, picking the appropriate moment to move on to the next step is important, as is having a definite agenda for the last two steps, which are "wrap-up results" and "extension of this concept into different physical situations." ILD's uniqueness is the real-time data provided by the MBL tools. The ILDs must be presented in a manner that builds student confidence in the measurement devices. Flashy exciting demonstrations are eschewed as too complex to be effective learning experiences. A sequence of ILDs, say on Human Motion, takes about 40 minutes and makes use of the motion detector, force probe, Universal Laboratory Interface and Tools for Scientific Thinking software. Evaluation of student learning was done by an in-house test and the FMCE. A summary of pre- and post-instruction results is examined for a subset of problems: Force Sled, Force Graph, Cart on Ramp, and Coin Toss. Evaluation of a University of Oregon, 200 student, non-calculus, introductory physics lecture showed a huge improvement on the 10% overall gain due to traditional lectures. Replacing 80 minutes of traditional lecture with ILDs resulted in more than a 50% overall gain. Evaluation at Tufts University confirmed the Oregon results. The actual Force Sled and Coin Toss questions are in the paper, as is the ILD prediction sheet. The paper also has some demonstration descriptions from Newton's First and Second Law ILDs.

I want to raise three points. First, these authors also authored the FMCE; they just reversed precedence. Second, use of computers in I.E. classes is a controversial topic and will be dealt with in some detail in papers 22 through 25. Third, very little of PER is strictly technical in nature, but given the reliance of ILD on equipment, I thought the following paper might be of interest. D. MacIsaac and A. Hämäläinen, in "Physics and Technical Characteristics of Ultrasonic Sonar Systems," published in The Physics Teacher Vol. 40, give a very detailed analysis of the SX-70 ultrasonic ranging system components, patented and manufactured by Polaroid Corp. This system is used in most introductory physics lab ultrasonic motion detectors. Their analysis includes history, beam patterns, blind spots, buzzing sounds, resolution, precision, accuracy, common classroom difficulties, and pedagogy. Any user of motion detectors would benefit from reading this information, particularly before doing an instructional demonstration in front of a couple hundred students.


22.2 : Paper 10, Audience Paced Feedback


J. Poulis, C. Massen, E. Rubens and M. Gilbert, "Physics lecturing with audience paced feedback," Am. J. Phys. 66(5), 439-441 (1998)


The traditional lecture format is flawed, and the use of computers and multimedia to improve upon it is constrained. Paper 10 provides a technique to improve lectures: Audience Paced Feedback (APF). APF provides each student in the lecture theater with an electronic handset which allows every student to answer simple binary questions from the lecturer. There are four question formats possible: (1) exploration, (2) verification, (3) interrogation, and (4) organization. APF is fundamentally different from the raise-your-hands-to-answer style because: (a) all students are answering, (b) the lecturer can ask multiple choice questions, (c) the students' replies are anonymous, (d) the lecture format shifts closer to that of a seminar, and (e) a permanent record is possible.

The students rated APF lectures at 6.7 and non-APF lectures at 5.1 on a one to nine Likert Scale (nine being very strong positive). The pass rate of APF lectures was approximately 87%, while for the non-APF lectures it was approximately 58%. APF lectures also had a smaller standard deviation. The sample size for both APF and non-APF was approximately 2600 students. APF allows the lecturer to ensure that the majority of the student body has understood the material before moving on. APF also gives the students an active role in the lecture. During many questions, students are given time to discuss the problem which brings an element of student-to-student teaching into the lecture. Finally, there is a small Hawthorne effect due to the unusual and positive environment.

The above paper has many twin brothers, the most equal being Eric Mazur's book, Peer Instruction: A User's Manual, ISBN 0-13-565441-6. All of these focus on achieving student feedback to the lecturer; how feedback is achieved and what kind of feedback occurs vary. What the lecturer is supposed to do with the feedback also varies. Nevertheless, there is a strong push in lecture-enhancement PER to construct a student-to-lecturer feedback loop, primarily to reduce the passivity of the student. There is also a strong push to incorporate student-to-student discussion/instruction in the lecture format, in part for the same reason. The upcoming paper is our last to advocate modifying the lecture itself. It will be followed by several that enhance the lecture through activities in the associated lab or discussion groups.


22.3 : Paper 11, Peer Instruction


C. Crouch and E. Mazur, "Peer Instruction: Ten years of experience and results," Am. J. Phys. 69(9), 970-977 (2001)


Actively engaged students learn more than passive receivers of knowledge. Cooperative activities are an excellent way to engage students. Paper 11 presents the results of ten years of Peer Instruction at Harvard University. The courses are a mix of algebra-based and calculus-based introductory physics courses for non-majors. Peer Instruction has been adapted to a wide range of contexts and instructor styles. Over the ten year period, Peer Instruction has been refined. The three major changes were: (1) replacement of reading quizzes with the warm-up exercises of the Just-in-Time-Teaching (JiTT) strategy [ISBN 0-13-085034-9], (2) use of a research-based mechanics text, and (3) use in discussion sections of Tutorials in Introductory Physics (McDermott et al.) and of group problem-solving activities (Heller et al.). Peer Instruction is a structured questioning process that involves every student in the class. It divides a class into a series of short presentations, each followed by a related conceptual question (ConcepTest). Students are given one or two minutes to formulate individual answers and report their answers to the instructor. If the percentage of correct students is between 30% and 70%, which it normally is, then the students discuss their answers and the underlying reasoning with each other. After these two-to-four minute discussions, the instructor polls students for their answers, explains the answer, and moves to the next topic. ConcepTest questions are part of midterms and finals. If the percentage is less than 30%, additional instructor-centered teaching is needed; if greater than 70%, moving straight to the next subject is the most effective use of time; a sketch of this decision rule follows this paragraph. Time is an issue. In the algebra-based class, approximately 15% of the old traditional curriculum is not covered. The calculus-based class does cover all of the old curriculum. To free up class time, students are required to read prior to class. This was enforced by a beginning-of-class reading quiz; it is now enforced by a three question web-based assignment due before class. This pre-class information is used to focus the lecture on student identified needs.
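
The classroom decision rule is simple enough to write down; this sketch (mine, with the thresholds taken from the paper as reported above) makes the branching explicit.

    # The ConcepTest decision rule after the initial individual poll.
    def conceptest_step(percent_correct):
        if percent_correct < 30:
            return "additional instructor-centered teaching is needed"
        elif percent_correct <= 70:
            return "peer discussion (2-4 min), re-poll, explain, move on"
        else:
            return "move straight to the next subject"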

Student knowledge is assessed by the FCI, the MB, traditional exams, and ConcepTest performance. Conceptual Mastery, as measured by <g>, has risen from 50% to 78% over a six year span for the calculus-based classes and has averaged around 62% for the algebra-based classes over the last couple of years. This is in comparison to Hake's reported average for I.E. classes of 48%. Problem solving has been de-emphasized in lecture, with students learning these skills in discussion section and via homework assignments. Still, quantitative problem solving skills, as measured by the MB, have risen from 72% to 79% for the calculus-based class and are around 66% for the algebra-based class. In a comparison of Peer Instruction versus Traditional back in 1990 and 1991, <g> jumped from 23% to 50% in one year, and MB scores increased from 66% to 72% in the same year. Also during those two years, common exam problems were given, and a statistically significant increase in student quantitative problem solving skills was found. [Note: paper 4 states the MB's "main intent" is to assess qualitative understanding even though it looks like a conventional quantitative test.]

Three results stand out from analyzing student responses to all of the ConcepTests over an entire semester. First, 40% of answers were correct both before and after discussion, 32% shifted from incorrect to correct, 22% remained incorrect both before and after discussion, and 6% changed from correct to incorrect. Thus, prior to discussion 46% of answers were correct and 54% incorrect; after discussion 72% were correct and 28% incorrect. Second, evaluation of semester testing shows that students retain real understanding of the concepts. Third, the strongest students are challenged by ConcepTests, with no student getting more than 80% correct prior to discussion.

The last part of the paper is Implementation, with subsections: Reading Incentives, Cooperative Activities in Discussion Sections, Quantitative Problem Solving, Student Motivation, ConcepTest Selection, Time Management, Teaching Assistant Training, and Resources. Most sections are two or three paragraphs in length. Some highlights not already mentioned include: (1) approximately one-third to one-half of class time is spent on ConcepTests, (2) teaching assistants are required to attend lecture, (3) there is a web site, http://galileo.harvard.edu, which includes over 800 ConcepTests, and (4) student evaluations and attitudes are not measures of student learning.

Obviously, after ten years there is a lot here. Two points I wish to highlight at this time are the sensitivity PER has in regards to quantitative problem solving and to content-coverage. More than a few I.E. methods have been poorly received by traditionalists, in part because some I.E. students can talk-the-talk but not walk-the-walk when it comes down to getting a numerical answer to a real-life question, and in part because most I.E. students have "not been exposed" to certain subjects such as buoyancy, due to the time-intensive nature of I.E. Interactive Engagement exchanges breadth for depth, arguing that mere "exposure" is worse than useless as it wastes valuable class time. Traditionalists argue that to not teach buoyancy and its ilk is to reduce introductory physics to mechanics. Leaving aside these deep and unresolved issues, the following papers propose that the lecture can be redeemed by activities outside of lecture, or at least propped up by them.


22.4 : Paper 12, Socratic Dialogue Inducing

Hake, who six years after paper 12 will father <g> for an apprehensive PER community, brings us an I.E. use for labs. Normally, lecture classes have attached to them labs and discussion groups. This trinity is usually what's meant by "Traditional lecture-based classes." Rather than mess with the large student group, the lecture, Hake focuses on a small student group, the lab. Much of I.E. is oriented toward small teacher-to-student ratios.


R. Hake, "Socratic Pedagogy in the Introductory Physics Laboratory," The Physics Teacher, Vol. 30, 546-552 (1992)


Socratic Dialogue Inducing (SDI) labs are simple Newtonian experiments designed to produce conflict between a student's common sense understanding and Newton's Laws. This conflict induces collaborative discussion among lab partners and/or a socratic dialogue with an instructor. SDI is an active-engagement method that is much more successful in transforming a student's thinking from Aristotelian to Newtonian than the usual bombardment-of-passive-students method. SDI is useful in large enrollment settings (~100 students) and was inspired by the empirical work of Arnold Arons. This reference describes SDI labs and procedures, gives examples, and presents some conclusions.

SDI labs emphasize hands-on experience in five manuals: #1 Newton's First and Third Laws, #2 Newton's Second Law, #3 Circular Motion and Frictional Forces, #4 Rotational Dynamics, and #5 Angular Momentum. These labs promote student mental construction of concepts through: (1) conceptual conflict, (2) extensive verbal, written, pictorial, diagrammatical, graphical and mathematical analysis of concrete Newtonian experiments, (3) repeated exposure to experiments at increasing levels of sophistication, (4) peer discussion, and (5) Socratic dialogue with instructors. SDI labs are: (a) adaptable to a wide range of student populations, (b) popular with students, (c) inexpensive in equipment costs, (d) easily modified, and (e) combinable with either other active-engagement methods or standard methods. Further, they allow instructors to discover learning problems and can provide valuable research data, if the dialogues and conversations are recorded and analyzed.

This reference goes into detail about SDI Lab Procedures. Each lab session has 24 students (4 students at each of 6 tables) with 2 Socratic dialogists, one of whom has had previous experience. There are five primary ground rules, each given several paragraphs of development. A short sketch of the rules follows ['you' refers to the students]: First, you must understand the material you work on rather than "cover" all the prescribed sections. Second, draw "snapshot sketches" with color-coded vectors. Third, justify your collaborative responses with a thoughtful explanation and/or sketches. Fourth, if you are confused, after serious effort, call in a Socratic dialogist. And fifth, handed-in lab manuals will be examined and deficient work must be corrected at the next lab period.

Hake provides five lab questions and a Representative Socratic Dialogue out of SDI Lab Manual #1. He highlights three of his newer lab manuals, including the Water Bucket Swing, the Old Spinning-Wheel-in-the-Suitcase Trick, and the Cat Twist. SDI labs are effective in guiding students to construct a coherent conceptual understanding of Newtonian mechanics. This is due to: (1) the interactive engagement of students, (2) the Socratic method of instruction, (3) kinesthetic sensations that intensify cognitive conflict, (4) cooperative grouping, and (5) repeated exposure to the coherent Newtonian explanation in many different contexts. Hake concludes by noting that more research and development is needed, including moving some of the instructional load to computers.

Three points in regard to the above. First, the Socratic method, in brief, is a questioning process by which the dialogist "leads" and "prods" his student to insight and the correct answer without actually telling him either. To put it mildly, such a process can backfire, particularly if the student is at the cognitive level of desiring "the truth" from an authoritative source. Second, cognitive conflict is critical to several I.E. methods, most notably McDermott's Tutorials. Very basically, the idea is that students enter class with non-Newtonian beliefs that work for that student; many of these beliefs were created by the student from his life experiences. In order to replace one of these preexisting beliefs with a Newtonian one, the preexisting belief must be shown to be in obvious conflict with simple observable fact. Only then is the student receptive to accepting and using a "better" belief, the Newtonian. This too has its foundation in cognitive studies, far outside of physics-content, and thus was dealt with in Chapter 07. Also from cognitive studies is our third point, constructivism. The fundamental idea of constructivism is that students construct their own concepts, and that, at best, the teacher can semi-prepare the work site and provide a few basic tools. Thus, for constructivists, teachers don't teach so much as facilitate the student's learning process. The upcoming paper also advocates interactive engagement of students during the lab component of the lecture, lab, discussion-group triad. It is interested in the non-major and raises several issues, such as the difference in instructional impact on men and women.


22.5 : Paper 13, Inquiry Experiences


J. Marshall and J. Dorward, "Inquiry experiences as a lecture supplement for preservice elementary teachers and general education students," PER Am. J. Phys. Suppl. 68(7), S27-S36 (2000)


The introduction of paper 13 is a literature review advocating interactive engagement classes. It notes that significant gains on the FCI are possible even with the comparatively small investment of using I.E. to supplement a Traditional class. Paper 13 argues that elementary school teachers should be taught in the way that they will teach. Further, it argues that elementary school students learn best via hands-on interactive methods. The literature review of twenty-seven references includes the important Resource Letter by L. McDermott and E. Redish, "Resource Letter PER-1: Physics education research," Am. J. Phys, 67(9), 755-767 (1999).

Convenience samples are widely used in PER, with true random sampling rare. In paper 13's Preliminary Study, students were divided based on whether they had or had not also signed up for the separate lab course. Over the two quarter Preliminary Study, the match of lab and non-lab to inquiry and non-inquiry was flipped. To further mitigate the effects of convenience sampling, students were compared by GPA, gender, and major. In the Comparison Study, the entire third quarter class did inquiry activities and was compared to algebra and calculus-based classes. This comparison was done with a McDermott-published light-bulb-brightness test question. A quantitative comparison question was not used; in a nod to Thacker et al., this was viewed as a shortcoming in the research.

The inquiry exercises were adapted from Physics-by-Inquiry, Amusement Park Physics, and suggestions in Arons's book, A Guide to Introductory Physics Teaching. The exercises did not involve explicit pre- or post-testing, although the material covered was included in course exams. Students worked in self-selected cooperative groups of two to six people, and due to time constraints, several Physics-by-Inquiry activities were shortened. While the activities in this paper are not appropriate for young children, material developed by Elementary Science Study is appropriate.

There are constraints and limitations on the generality of the results, most particularly from nonrandom (convenience) sampling and the small size of subgroups. Assessment was based on final exam grades, course grades, and an inquiry subset of midterm exam questions. Standardized tests such as the FCI were not used, although a few specific questions from published sources were used. One hour focus group interviews with volunteer students were also conducted. The result of the Preliminary Study is that women benefit from inquiry-based laboratory exercises at a statistically significant level; men do not. Gender plays a more important role in determining inquiry benefit than does the choice between elementary education and general education. Multivariate analysis of variance (MANOVA) and T-tests were used to arrive at this conclusion. The result of the Comparison Study is that 26% of inquiry and 9% of non-inquiry students ranked the light bulbs in McDermott's question correctly and gave the correct explanation. Most of the student comments match those found in McDermott's published paper. The authors do note that some real life problems, such as irregularities in bulbs and inadequately charged batteries, can lead to misconceptions. They now have TAs check all bulbs and batteries prior to lab. The particular misconceptions noted were that bulbs used up some of the current, and that batteries were constant current sources.

Interviews provided some insights. Inquiry sessions were most beneficial when they followed a lecture. The sessions helped narrow the preexisting distinction between teaching science and doing science. For a few students, the perceived lack of direction in comparison to strongly prescriptive labs led to frustration and withdrawal. Most students found inquiry "dynamic," "exciting," and "alive." No cascading effect was found; the only differences between inquiry and non-inquiry groups were on those questions dealt with directly in the inquiry exercises. Thus, as many topics as possible need to be addressed via the inquiry method. Non-lab inquiry was performed during six one-hour lecture periods. Lab inquiry was performed during six two-hour lab sessions. Non-inquiry students were assigned extra homework, or, if assigned to a non-inquiry lab, did normal prescriptive labs.

From one point of view, there is a lot less to this reference than meets the eye. The authors are doing MANOVA and T-tests on one class of students at one institution in their Preliminary Study, and comparing two classes on one question in their Comparison Study. From another point of view, paper 13 advances some important points. It's up front about convenience sampling, something to which most PER is oblivious. It uses interviews, although they were student-led. It acknowledges the reality of self-selected cooperative grouping in many circumstances, standing in stark contrast to the theoretical ideal of rotating, random, four-person groups with internally rotating assignments such as recorder, critic, etc. The primary point of interest is the assertion that inquiry methods differentiate by gender, with women benefiting and men not.


22.6 : Paper 14, Supervised Practice

I'm going to label Supervised Practice as SP. The paper after this one is Studio Physics; let's label it StP. Which beats SP1 and SP2, or so I hope. On a more serious note, I can never trust my memory with "Supervised Practice"; I think of this paper as "White Board" for reasons you'll see soon enough.


M. Johnson, "Facilitating high quality student practice in introductory physics," PER Am. J. Phys. Suppl. 69(7), S2-S11 (2001)


Paper 14 investigates issues involved in facilitating high quality practice of the knowledge and skills that students are learning in introductory physics. A classroom peer-collaborative structure, Supervised Practice (SP), is described and critiqued. Experts and Novices approach problems differently. Experts often use complete, accurate diagrams and start from fundamental principles. Novices use the skills they have previously developed: algebra and calculator use. Once a Novice has an answer, he's done. For an Expert, the answer must yet be checked for reasonableness in both magnitude and units. A goal of education is to develop in the Novice the problem solving (approaching) skills of the Expert. Human tutors and, in certain settings, computer use have proved to be viable methods for achieving this development. There are, however, cost and technical issues that encourage other approaches to this goal. Some effective, affordable, large-class structures exist that are able to provide timely guidance and feedback to groups of students while they practice expert skills for problem solving and concept interpretation. Small group problem solving is another approach that can be used to promote high quality practice among students. Timely feedback, including the correctness of solutions, is critical to student learning. In supervised small group practice, a given student can get feedback swiftly from the teacher and/or his several group members. Additionally, a student also gains educationally by giving feedback. Finally, a teacher can communicate to several people who share a common problem at a common time. The supervising teacher can also focus on process, praising a well-labeled diagram or encouraging students to demonstrably check their answers for magnitude and units.

There are challenges in implementing Supervised Practice. These include: (1) peer-peer communication difficulties, (2) student-instructor communication difficulties, and (3) shifts in attention that hinder group coherence. Physics requires a special vocabulary and special diagramming techniques which a student must learn. Students face difficulties communicating in the language of diagrams and algebraic symbols. These difficulties erode the effectiveness of communicative attempts at providing useful guidance and feedback. The effectiveness of instructor feedback depends on both the level of understanding the instructor has about what the students have done, and the level of shared understanding the group of students has about what they have done. At the extreme, if an instructor does not understand what the students have done, he is reduced to working the problem for the group. If students don't share an understanding, the instructor must address each individual separately. SP directly fights against the common problem of students using tools they already know to quickly get homework answers; students thereby fail to develop the expert problem-solving skills so beneficial in real world contexts. People have different attention spans; they also have different abilities to discriminate between important and peripheral issues. Thus, it is probable that individual students will focus on different aspects of the solution and will not follow the details of the group's solution as it progresses over the 30 minutes needed to do the average homework problem. The general strategy that addresses these three difficulties is to create a classroom environment that (a) facilitates effective communication between peers about the details of the group problem-solving process, (b) allows the instructor to clearly see what the students have done in the process of generating the group's solution, and (c) provides a semi-permanent record that allows students to see what has happened when they tune back in. The tool that allows easy achievement of this general strategy is the white board [about 18" x 24" is a good size]. As the single solution is written out, it is a common, easily seen representation of the group's work; this facilitates both intra-group communication and instructor feedback. Its semi-permanence accommodates variations in student attention spans.

Supervised Practice integrates the interventions proposed above and was implemented at Carnegie Mellon University from 1993 to 1996. SP met twice a week for 50 minutes per session, in groups of twenty-five students. The structure of the introductory physics course also included lectures (60 to 240 students) on Monday and Wednesday, and exams or quizzes on Friday. There was access to a drop-in center several hours each day. Reif's research-based text, Understanding Basic Mechanics, was used. SP practical points follow. (1) SP meetings are staffed by two instructors drawn from graduate students, upper-class undergraduate majors, and volunteer post-docs. (2) The instructional staff meets weekly. (3) There is a "group-record" on which attendance, preparation, and progress are recorded. (4) Students do not hand in completed problem solutions. (5) Instructors are encouraged, but not required, to assign students randomly to collaborative groups. (6) The enforcement of roles within groups was the subject of much debate and was ultimately left to the instructor's discretion. (7) Only one pen is given out per group, with explicit instructions to pass the pen after each problem. (8) Students are required to attempt a subset of homework problems prior to each SP meeting. (9) The TA checks that each student has attempted the subset. (10) SP meetings start with a warm-up problem. The students work on it for the first five minutes; then the instructor presents a complete solution and explanation on the blackboard. (11) Instructors encourage students to talk about the solution process and to write all the details of the solution on the white board. (12) After completing each problem, each group is required to discuss its solution with an instructor, at a "check point". This check point facilitates student-instructor communication. (13) No solutions are collected or graded. The TA does note problem completion on each group record during the checkpoint process.

Paper 14 provides a two-page illustrated example of SP. The example focuses on student interactions during the diagramming of an Atwood's machine problem. The example came from a real classroom interaction which was recorded by an observer in 1995. It is interesting to read, for flavor and authenticity. The main conclusion drawn is that high quality practice can be achieved in collaborative groups when students can communicate effectively with each other. The white board facilitates communication between students and between the student group and the instructor. It serves as a powerful memory device to facilitate the asking of questions between students, and it facilitates spontaneous review and summary between students. The white board is a powerful tool in the implementation of collaborative problem-solving practice. Other strengths of SP are: (1) problems are from a research-based course text; (2) closely related material is on the quizzes and exams; (3) close coordination between SP and the lecture, homework, and assessments; (4) students like it; (5) it's transportable to other institutions (University of Oregon); and (6) while several other learning features impact this, the FCI scores go from pre = 64% to post = 84%, with a <g> of 0.55. Challenges are: (a) participants can be unfamiliar with, or discomforted by, their new roles in peer collaboration; (b) there is a slight cost increase, mostly to pay undergraduate TAs for their 12-hour-a-week assignments; and (c) instructor training is very important, as graduate students are not necessarily experts in course material, pedagogy, and the management of a collaborative learning environment. This reference also highlights the Instructor Development at the University of Oregon. Part of this instructor development was a required one-hour weekly staff meeting. Instructors did the problems beforehand and brought their solutions. Instructor physics-content knowledge, student difficulties with physics concepts, student difficulties with problem-solving processes, and student difficulties with peer interactions were also addressed at the staff meetings.
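The <g> quoted in strength (6) is easy to check against the average normalized gain defined in paper 2:

    \langle g \rangle = \frac{\langle\mathrm{post}\rangle - \langle\mathrm{pre}\rangle}{100\% - \langle\mathrm{pre}\rangle} = \frac{84\% - 64\%}{100\% - 64\%} = \frac{20}{36} \approx 0.56,

which reproduces the quoted 0.55 to within rounding of the class-average pre and post scores.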

The above paper was fairly normal, if more verbose than usual, on Instructor Training. As another example of TA training, paper 11 notes a requirement for TAs to attend lecture; I was required to do this as an Astronomy TA and actually enjoyed it. Not only does attending lectures help train the TA in content, it also allows the TA to know what his students have been exposed to in lecture. This knowledge helps in both content instruction and social interaction between the TA and the students. It also closes several possible communication gaps between the TA and the professor. Drop-in centers under various names are an often-used but little-commented-on part of published I.E. methodologies. The above paper's acknowledgment of non-content goals is also common and is dealt with at length in the upcoming epistemology section (papers 30 through 35). The most unusual aspect of this paper was its focus on problem solving skills.



CHAPTER 23 : Interactive Engagement Methods that Replace the Lecture


23.1 : Paper 15, Studio Physics

The above six papers all retained the lecture to a large body of students as part of the course's overall methodology. The upcoming papers dispense with the large body of students and, largely, with the lecture. If teachers still talk a lot, students are talking more. Unfortunately, this alone is not enough to enhance our students' knowledge of physics, as our next paper painfully acknowledges. The upcoming paper is unique in my reading, and valuable to establishing the credibility of PER. Too often, published PER papers base a mountain of conclusions on a molehill of data, as with paper 13, or arrive at refutable conclusions, as with paper 8. The upcoming paper is sadly honest and, as such, a great boon. Examined failure is at least as great a teacher as success.


K. Cummings, J. Marx, R. Thornton, and D. Kuhl, "Evaluating innovation in studio physics," PER Am. J. Phys. Suppl. 67(7), S38-S44 (1999)


Introductory Studio Physics (StP) is Rensselaer's equivalent of the standard calculus-based two-semester physics course for engineers and scientists. The lecture and laboratories are integrated. Class size is 30-45 students. There is extensive use of computers. There is collaborative group work and a high level of faculty-student interaction. There has been no significant reduction in course content. Class meets twice a week for 110 minutes per session. Studio physics uses traditional activities adapted to the studio environment and incorporates the use of computers. These activities do not directly address student misconceptions and employ neither cognitive conflict nor bridging techniques. There is currently no explicit training of TAs, and as a result there is great variation in their effectiveness. Unfortunately, the <g> is 0.22 for normal studio physics classes. This is at the same level as traditional classes, despite the studio classroom appearing to be interactive.

There is a standard approach to Studio Physics, but a certain flexibility allows for some diversity. In an effort to improve both <g> and FMCE scores, ILD and Cooperative Group Problem Solving (CGPS) were incorporated into five experimental StP classes. These were contrasted with seven standard StP classes. Paper 15 has a page describing ILD and CGPS. The authors also took great care in specifying how their implementation of ILD and CGPS varied from the standard ILD/CGPS formats. In the case of ILD implementation, less time was spent on "analogies to similar situations" and "having the students discuss results" than the ILD Teacher Notes suggest. Furthermore, students were allowed too much time to make predictions, resulting in fast students losing interest and focus. The small studio classroom with non-sloping floors made it difficult for all students to see the demonstrations. Finally, the students were so collaboratively oriented that it was difficult to get individual predictions. This resulted in some students never making a personal intellectual commitment. Personal commitment is a key epistemological point necessary in creating an environment in which the mind can change its belief patterns. In implementing CGPS there were deviations from the five-step-process strategy. Some of these deviations were the result of StP structures common to all classes, experimental and standard; for example, no context-rich problems were included on common exams. Furthermore, only half the class periods were given over to CGPS, with the other half focusing on standard problems which would be tested. Three additional issues addressed were: (1) due to time constraints, the instructor did not model CGPS techniques as often as desired; (2) students were so strongly resistant to cooperative group roles (critic, recorder, etc.) that this aspect soon died out; and (3) because the CGPS five-step problem solving strategy is typically irrelevant to textbook-style homework, students resented having to use it; this was particularly true when they could solve the problems quickly and correctly without it.

The students enrolled without knowing whether the class would be standard or experimental. The tests were given back-to-back, with 25 minutes allotted for the FCI and 35 minutes for the FMCE. Post-instruction testing was ten weeks later. The standard 110-minute classes began with 30 minutes of answering questions and working board problems. This was followed by roughly 15 minutes of lecture. In-class assignments filled out the remaining class time. Some instructors did allow students to leave early if the assignment was completed. The experimental classes were very similar to the standard classes, with experimental activities replacing, not supplementing, part of the curricula. Four sections got all four ILD sequences; two sections got the entire CGPS package, with a one-section overlap. The standard class results are <gFCI> = 0.18 ± 0.12 and <gFMCE> = 0.21 ± 0.05. The experimental ILD class results are <gFCI> = 0.35 ± 0.06 and <gFMCE> = 0.45 ± 0.03. The experimental CGPS class results are <gFCI> = 0.36 and <gFMCE> = 0.36.

Information is provided per section, per person, and per group. Per-section information is provided in bar graph and table format. There is a minor labeling error in paper 15, with section 9 listed as either ILD or ILD/CGPS. ILD alone or CGPS alone works as well as the two combined, with ILD taking far less time. The authors do note that CGPS students "performed better" on the problem-solving section of the last course exam. There are scatter plots of individual FMCE pre- versus post-scores. Fifteen percent of standard and three percent of experimental students have a post-score up to 10% worse than their pre-score. However, there are impressive gains, with thirty-four percent of standard and sixty-four percent of experimental students having post-scores 20% or more better than their pre-scores. From the scatter plots, it is evident that the weaker students benefited from the experimental curricula. To determine the effect on the better students, the students were divided into thirds based on their pre-instruction test scores, with <g>s figured for all groups (a sketch of this tercile analysis follows below). The top third of the experimental group had the highest <g>s. The top third of the standard group had the lowest <gFCI>, if not quite the lowest <gFMCE>. The authors conclude that it is necessary to mentally engage students, and that small classes, cooperative groups, and computer availability are good but insufficient. They emphasize that it is equally important to use research-based questions and activities.
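For concreteness, here is a minimal Python sketch of the tercile analysis described above, computing a class-average <g> for the bottom, middle, and top thirds of students ranked by pre-score. The pre/post percentages are invented for the illustration; they are not data from paper 15.

    import numpy as np

    # Hypothetical FMCE pre/post percentages for twelve students (not paper 15's data).
    pre  = np.array([30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85], dtype=float)
    post = np.array([45, 50, 60, 55, 70, 75, 80, 80, 85, 90, 92, 95], dtype=float)

    order  = np.argsort(pre)            # rank students by pre-instruction score
    thirds = np.array_split(order, 3)   # bottom, middle, and top thirds

    for label, idx in zip(("bottom", "middle", "top"), thirds):
        # Hake-style average normalized gain for the subgroup.
        g = (post[idx].mean() - pre[idx].mean()) / (100.0 - pre[idx].mean())
        print(f"{label} third: <g> = {g:.2f}")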

Upcoming papers (30-35) also discuss cognitive conflict and bridging techniques. Paper 1 lists the average FCI taking time at Harvard as 23 minutes. I believe 25 minutes is not enough time to give the FCI, as little more than half of the students will have time to complete it. The listing of ILD implementation mistakes provides practical counterpoint to paper 9. I am curious as to whether any class switching by students occurred after enrollment and, if so, in what direction. Computers and convenience grouping rear their ugly heads but will be addressed elsewhere.


23.2 : Paper 16, Integrated Math, Physics, Engineering, and Chemistry


R. Beichner, L. Bernold, E. Burniston, P. Dail, R. Felder, J. Gastineau, M. Gjertsen, and J. Risley, "Case study of the physics component of an integrated curriculum," PER Am. J. Phys. Suppl. 67(7), S16-S24 (1999)


The Integrated Math, Physics, Engineering and Chemistry (IMPEC) curriculum is a four-year experimental program whose goal is to minimize attrition and improve both student understanding and student attitude. Many different research-based approaches were used to modify the first semester physics course. These included activity-based pedagogies, collaborative learning, integration of curricula, context-rich problems, and the use of technology. In addition, close attention was paid to student-student and student-instructor interactions, with the goal of arranging classroom layout and usage to facilitate a student-centered learning environment. The instructional environment is described in two pages of detail with a couple of examples. All the classes met in the same room, which was open 24 hours a day. The students were assigned to three-person teams. Team roles (recorder, checker, coordinator) and protocols were explicitly taught and were reinforced via grading schemes. Women and minorities were paired within teams, and a variety of seating arrangements were tested. Lecturing was minimized. Class time was spent doing hands-on activities adapted from existing curricula like Workshop Physics, Physics by Inquiry, ConcepTests, ALPS worksheets, and the simulation engine Interactive Physics. In spite of some distractions, continuous access to computers with MBL interfaces and software added enormously to the classroom milieu. In addition to computers, assigning students the responsibility to read their textbooks outside of class freed instructors to move about the classroom and enter into Socratic dialogues. Quizzes on the text and end-of-chapter homework ensured that students took the reading responsibility seriously. Labs were conducted as short exercises interwoven with the discussions. Labs ranged from ten minutes to several hours in length and were often initiated by a student question. These planned activities often appeared spontaneous to the students, and the instructor found that an excellent student motivator was for students to never know what to expect in class. The strong group ties cultivated through IMPEC led students to quietly "check with their neighbor" before turning to the instructor for information. In the process of constructing their own understanding, students increasingly challenged each other and their instructors.

In evaluation and assessment, IMPEC students were compared to a control group of students who had volunteered for IMPEC but had missed out in the semi-random draw. IMPEC membership was constrained to match the gender and minority status of the entire engineering freshman student body. There was both qualitative and quantitative assessment. The qualitative assessment included: (1) analysis of Listserv e-mail via Strauss and Corbin's Grounded Theory, (2) use of student work and direct questionnaires, and (3) more than two hundred hours of field notes and videotape of teacher-student and student-student activities. The quantitative assessment included the FCI, the Test of Understanding Graphs in Kinematics (TUGK), and calculation-oriented traditional class exams. Both experimental and control group students took a common final exam.

Qualitative data highlights the critical importance of socialization in the classroom. The major student use of the Listserv was for socialization messages (38%). Encouragement and socialization made up 24% of faculty e-mails. Field notes and videotape reveal quite plainly that the same students, in the same room, working in the same groups, responded differently to different teachers. Student-faculty interactions depend greatly on the personalities involved; although difficult, it is possible to improve these interactions. In an example of instructor-instructor learning, the importance of providing ample wait time after your questions is highlighted. One instructor answered his own questions too quickly, resulting in passive class behavior by an otherwise aggressive class. In experimenting with classroom layout, large circular tables without computer monitors were found to most facilitate group dynamics. No real determination of students per computer was made, except to note that three students on two computers did not work. Students showed extremely high satisfaction on course evaluations, with all but one student rating IMPEC 5 out of 5 on a Likert scale.

Quantitative results focus on passing rates, Likert confidence levels, final exam scores, TUGK scores, and FCI results. Passing was a grade of C or better in five courses: calculus I, calculus II, chemistry, physics, and engineering. IMPEC versus Traditional overall passing rates were 73% versus 51%. For female students, passing rates were 63% versus 44%, and for minorities, 100% versus 20%. Likert confidence levels were up in all classes for IMPEC students and down for Traditional students. While positive attitudes are valued, engineers solve problems. On shared exam problems, IMPEC students averaged 80%, with Traditional students averaging 68%. On the TUGK, IMPEC students scored (89 ± 2)% versus (48 ± 2)% for peers at other institutions. <g> for IMPEC in 1996 was 0.42 ± 0.06; for IMPEC in 1997, 0.55 ± 0.05. It's noted that the <g> results are instructor independent. The same instructor teaching a traditional class in the same year had a class <g> = 0.21 ± 0.04.

In a follow-up, no significant difference in standard exam performance was found between previous IMPEC students and their traditionally taught peers during a subsequent traditional E&M course. The authors believe their most important finding is the central role socialization played in the success of the IMPEC students. No social barriers based on race or gender were found after careful scrutiny. Technology has a central role in IMPEC. The assignment of computer-related tasks brought everyone back into focus; in this variant of peer instruction, students became involved because they had to interact with the technology. IMPEC class size was roughly thirty-five students. The authors are in the process of scaling IMPEC up to class sizes of 100 students. In two last notes, the authors assert that the focus must be kept on the phenomenon being studied rather than on an authority talking about the phenomenon, and that students respond positively when many instructors use many different techniques in a fairly short time period.

This is still another paper that emphasizes reading outside of class. It's common for I.E. classes to offer depth in the classroom and to make an attempt at breadth outside of the classroom. "Socratic dialogue" brings up echoes of paper 12. The use of the FCI as a quantitative results measure is highly unusual, particularly given its focus on natural language and concepts. The MB is far more commonly labeled quantitative, even though paper 4 explicitly states that "the main intent [of the MB] ... is to assess qualitative understanding although it looks like a conventional quantitative test." Still, the excellent retention percentages for women and minorities and its good <g>s make paper 16 a valuable contribution to PER. Its highlighting of socialization as critical to student success is all but unique. Unfortunately, this reference's comment in regard to there being no significant differences between ex-IMPEC students and their traditionally taught peers in subsequent courses is all too common. While some I.E. benefits could easily be course-specific, one hopes some benefits are generalizable and usable by students in future situations without a continual support structure provided by an instructor. Socialization would have seemed to fit the bill. It's sobering that even socialization fails to be a self-sustaining tool for enhancing future learning.


23.3 : Paper 17, History and Philosophy of Science

The upcoming paper is incredibly laden with theory. It uses high school physics classes for its data, and, while HPS is a method that does work in a lecture format, its application in this paper is decidedly lecture-excluding.


I. Galili and A. Hazan, "The influence of an historically oriented course on students' content knowledge in optics evaluated by means of facets-schemes analysis," PER Am. J. Phys. Suppl. 68(7), S3-S15 (2000)


[This study] tries, by means of elicited structural components of students' knowledge, to infer the influence of a historically oriented instruction in optics on the content conceptual knowledge of students in this science domain.


Paper 17 provides a full page of theoretical background on the structure of knowledge in learners, most of it historical and developed by non-physicists. The authors present Mach, Bruner, Piaget, di Sessa, and Minstrell. The first two argue that people are unable to grasp, remember, or manipulate a huge amount of complex content without knowledge of structure. Piaget is the founder of constructivist theory, which "pursues picturing human cognition by its elements related in schemata." Di Sessa, the father of p-prims, argues for the existence of stable cognitive constructs spontaneously created in the form of fundamental self-explanatory patterns. Minstrell advances facets-of-knowledge as the means by which students understand particular physical settings. Facets are more context specific and thus less fundamental than p-prims. Facets-of-knowledge can be grouped in clusters. The core which underpins the facets is more inclusive and less context dependent. Thus, the core constitutes a scheme-of-knowledge. Schemes relate concrete entities and evolve in the course of formal learning. Schemes do not require mutual consistency. The impact of instruction can be judged by which schemes are held by students and the prevalence of those schemes. Given the great versatility of naive conceptions, instruction should aim at the essence, the learners' schemes.

Scientific treatises written at the dawn of science are good examples of claims and accounts about regularities observed in specific situations (facets) which were later represented by an inclusive proposition (a scheme-of-knowledge). In the past, teaching physics by the historical method had both proponents and opponents. Recently, the arguments in favor of using HPS (history and philosophy of science) in science instruction have been strengthened. First, presenting "unsuccessful" attempts at conceptual development that nevertheless helped to attain present scientific knowledge shows students a realistic picture of the complex transformation of knowledge from old to new. Second, the historical conceptual difficulties overcome by scientists in the past are similar to those faced by learners today. The arguments employed by the great minds of the past can be reapplied today, helping the learners of today. Third, HPS exposes competitive ideas and subjective perspectives in science, which humanizes science education and makes science appealing to a wider variety of minds.

Optics was the field chosen to test the HPS method of teaching science because: (1) the current scientific account is highly anti-intuitive, (2) there is an impressive abundance of naive concepts, and (3) there is a rich 2500-year chronicle of optical conceptions replacing each other. The HPS course preserved the standard menu of regular curriculum topics. It interwove presentations on the historical growth of optics understanding with discussions about the nature and behavior of light. The authors chose the historical content to match known schemes of students' alternative knowledge. One example from a small table matching "historical sciences" to "students' (mis)conceptions" is Al-Hazen's concept of vision matched to the Image Projection Scheme.

The subjects were in four 10th grade classes. There was a control group of three 10th grade classes. The students were from three types of schools and had four hours of instruction a week. A specially prepared textbook was used. Assessment was both qualitative and quantitative. The facets-of-knowledge (F-of-K) and the schemes-of-knowledge were compiled from student questionnaires. Content assessment covered the standard optical curriculum material only. Problem solving was addressed in class but was not reported. The frequency of each facet or scheme, and the difference between experimental and control group frequencies, are addressed quantitatively.

The Findings-and-Interpretations-of-Data section is subdivided into knowledge of vision, knowledge of the nature of light, and knowledge of optical imaging. Facets associated with scientific conceptions indicate learning and positive gain in knowledge, but not complete acquisition. The data is presented in tables, backed up with schematic reproductions and highlighted by student quotes. Under knowledge of vision, the nonscientific scheme is the "Spontaneous Vision Scheme" with its constituent five facets-of-(mis)knowledge. All five of these share the common misconception that vision is a natural phenomenon lacking delivery of light (or anything) from the object into the observer's eye. As an example of facets-of-(mis)knowledge (F-of-(mis)K), #2 is: objects are observed when merely being located in the field of vision (and are not blocked). An example of a facet-of-(true)knowledge associated with the scheme "Scientific Conceptions" is #4: vision is explained by the fact that light must leave the object and enter the observer's eye. #2 was the most common misconception out of five listed. #4 was the most common true conception out of six listed. Under knowledge of the nature of light, the schemes are "Reified Light" and "Scientific Conception". The most prevalent F-of-(mis)K is #4: light is comprised of (many or an infinite number of) light rays which fill space. The most common facet-of-(true)knowledge is #4: light expands in the environment from objects with a decreasing intensity until it strikes opaque objects. Under knowledge of optical imaging, the four schemes are: Image Holistic Scheme, Image Projection Scheme, Scientific Conception (lens), and Scientific Conception (mirror). The Image Holistic Scheme and Image Projection Scheme look similar and can only be distinguished through the mechanism of image transfer, a fact distilled from interviews. A F-of-(mis)K for the Image Holistic Scheme is: an image is always formed and can be obtained on a screen (mirror); there it could be observed (afterwards). A F-of-(mis)K for the Image Projection Scheme is: explaining the image in a lens, students produce a diagram of a point-to-point connection of an object with its image by means of a single ray.

While the F-of-K's are interesting in their own right, the paper compares the frequency with which each is held by the control [FAc] and experimental [FAe] groups. The F-of-(true)K most favorable to the experimental group and least favorable to the control group is: when the lens is removed, no image is produced; here FAc = 0 and FAe = 79. There are a total of forty-six F-of-Ks listed. Twenty-three of them are matched to nonscientific schemes, and thus are really F-of-(mis)Ks. The data is illustrated by twenty-nine reproductions of student sketches. HPS subjects are shown to have both a far higher frequency of valid scientific knowledge and a far lower frequency of invalid alternative knowledge schemes. The benefits are attributed to the HPS curricula, highlighted by a couple of examples from the Atomists' theory of Eidola and Al-Hazen's medieval theory of image transfer. The historical refutations of those models provide us today with a rich, elaborate, interesting set of methods with which to refute, in turn, the Image Holistic Scheme and the Image Projection Scheme. Historical models are neither too obscure nor too complex for students to learn; they help make physics courses more effective and attractive to a wider population. This method does require two things. First, one must elicit the structure of student knowledge. Second, this knowledge must guide the selection of appropriate historical content.

Having written the above, I feel the gem has been lost. Leaving aside the justifying theory and the assessment discussion, the basic message is this: the misconceptions of today's students match, exactly, historical science believed in its time by famous intelligent people (Aristotle). This historical science was refined into more correct, yet still wrong, not-so-old historical science by famous intelligent people (Al-Hazen). This refinement continues to the present day. The students' incorrect beliefs are not dumb, nor are the students alone in believing them. The only basic problem is that real life is very complex, intricate, and non-intuitive. HPS shows how Al-Hazen corrected some of Aristotle's misconceptions and was in his own turn corrected. Using the insights and failures of great men and the wonders of time compression, you and I can walk this historical road and see both where the scientific community is today in its beliefs and why so many good ideas were left behind for ideas that were yet better, if not yet perfect. Before moving on, I'd like to bring to the reader's attention a book review by S. Mahajan, "A Gold Mine of Teaching Ideas," published in The Physics Teacher, Vol. 39, page 512 (2001). This reviews Time For Science Education, which combines physics with history, philosophy, sociology, and education theory. The book's author, M. Matthews, uses a fascinating tale of French revolutionary politics and science to show the connection between π² and the acceleration due to gravity as measured in metric units.
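Presumably the connection runs through the seconds pendulum, once proposed as the definition of the meter; reconstructing the arithmetic under that assumption, a pendulum of length L = 1 m taking one second per swing (period T = 2 s) gives

    T = 2\pi\sqrt{L/g} \quad\Rightarrow\quad g = \frac{4\pi^2 L}{T^2} = \frac{4\pi^2 (1\,\mathrm{m})}{(2\,\mathrm{s})^2} = \pi^2\,\mathrm{m/s^2} \approx 9.87\,\mathrm{m/s^2}.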


CHAPTER 24 : Interactive Engagement Methods Potpourri



24.1 : Paper 18, Communicating Physics Through Story

That ended my third chapter. We started with assessment tools, papers 1-8; the second chapter covered I.E. methods that retain the lecture, papers 9-14; the third covered I.E. methods that replace the lecture, papers 15-17. The upcoming four papers also focus on how to teach physics. This chapter is a bit of a potpourri, unified in that all four papers neither retain nor replace the lecture, but rather ignore it. If anything, these papers flirt with what and why to teach as much as they do how. Still, there are very concrete, do-this recipes advanced in these works.


R. Stannard, "Communicating physics through story," Physics Education, 30-34 (2001)


Stannard's article details reasons why we should introduce modern physics to children and advocates the methodology of story telling. Paper 18 provides in its references a list of award-winning books written for children on the subject of modern physics, most of which were authored by R. Stannard. It's important to "get in quick"; society teaches all of us that anything associated with the name Einstein is accessible only to geniuses. To defeat this negative socialization, we need to hook children on modern physics before they learn that they aren't supposed to understand it. Some of the findings of relativity and quantum theory appear to defy common sense. Common sense was, for Einstein, that "layer of prejudice laid down in the mind before the age of eighteen." So the earlier we start teaching or exposing children to modern science, the thinner the layer of prejudice to penetrate; children's thinking is not yet "too set in its ways." An early familiarity with modern physics attracts young people to physics. The prospect of studying modern physics is the most influential reason why students choose to study university-level physics. The most important reason for getting in quick is that one finds radically new physics springing from the flexible minds of theoretical physicists at the commencement of their professional careers, in their twenties. This flexibility is a hallmark of a young mind and is increasingly difficult to retain as age advances. Yet the young mind must already know enough physics to appreciate what the outstanding problems of the day are. Gaining this essential background early widens the window of having both knowledge and flexibility. Age will close this window, leaving knowledge without flexibility, all too soon.

Story telling is the primary method of imparting physics knowledge to children. This method is in accordance with modern theories of the cognitive development of children (Piaget, Shayer, Wylam), which label children as concrete thinkers who think outwards from experience rather than as formal thinkers who extrapolate from a theoretical framework. Stories also have a rich historical validity, having shown over time that they interest and amuse listeners while providing a framework which allows accurate retrieval of knowledge. Story telling can carry serious messages and ideas, as shown by William Golding's Lord of the Flies and George Orwell's Animal Farm. Science can be conveyed by story telling. The classic example is Mr. Tompkins by George Gamow, which Stephen Hawking has declared to be a "great book". George Gamow was one of the founders of Big Bang cosmology. In Gamow's book, Mr. Tompkins finds himself in a fantastical world where Planck's constant is much larger and the speed of light much slower than is the case in our world. Relativistic effects are thus a common feature of day-to-day living, as are quantum uncertainties. This entertaining and intriguing book also allows the more serious-minded reader to explore these ideas in greater depth through interleaved formal lecture extracts. Mr. Tompkins, published in 1965, was an immediate bestseller, popular with both the public and professional scientists. In 1989, R. Stannard started the first of many books that bring modern science to children in both an accurate and a comprehensible manner. His latest book (2002) is Dr. Dyer's Academy, in which scientific misconceptions are dealt with in a manner similar to how morality is approached in C.S. Lewis' Screwtape Letters. As one of the motivators for instructing the young, it is noted that in a survey of 250 twelve-year-olds, 65% had no idea what a star was (i.e., a large distant sun).

Teaching the young is not normally given energy and time in university physics departments. Still, internships for undergraduates to "do real physics" are pushed by various institutions, particularly as a summer occupation. If teaching is the best way to learn, then some consideration should be given to undergraduate internships that employ the undergraduate as a teacher. The level of difficulty could vary from something as simple as reading Mr. Tompkins and other books to a second grade class one hour a week, to something as complex as serving as a high school physics teacher's TA one day each week. In any event, this wider socialization of physics through accurate but non-rigorous books is not limited to the young; despite excellent reasons to start young, the old and the late starters should have their opportunity as well. M. LiPreste, in "A Comment on Teaching Modern Physics," published in The Physics Teacher, Vol. 39, pg. 262 (2001), relates his positive experience with an appreciation course in modern physics. As texts, he uses Isaac Asimov's Atom and Brian Greene's The Elegant Universe. For a night class that was useless for degree requirements, 20 students showed up out of pure interest.


24.2 : Paper 19, Linking the Domains of Mechanics and Electromagnetism

The basic idea in paper 19 has been implemented at several universities. For example, the nine classes of the junior-year physics curriculum at Oregon State University are: Static Vector Fields, Oscillations, One Dimensional Waves, Quantum Measurement and Spin, Central Forces, Energy and Entropy, Periodic Systems, Rigid Bodies, and Reference Frames. These stand in contrast to the traditional courses of the senior-year curriculum: Classical Mechanics, Quantum Mechanics, E&M, Statistical Mechanics, Optics, and Math Methods. This breaking and shuffling of old domains centralizes the common principles that connect the old separate divisions into a continuous whole. The central principle of waves is used in the University of Oregon's Physics 351, the Physics of Waves, to connect the old divisions of mechanics, electricity, optics, and quantum. The upcoming paper connects Mechanics and E&M via several common principles.


E. Bagno, B. Eylon, U. Ganiel, "From fragmented knowledge to a knowledge structure: Linking the domains of mechanics and electromagnetism," PER Am. J. Phys. Suppl. 68(7), S16-S26 (2000)


Knowledge structure is the critical difference between expert and novice problem solving. Expert knowledge is organized around central principles; novice knowledge around external characteristics. The expert's advantage is that his organization allows the same knowledge to be used in different domains and in unfamiliar situations. Students, in contrast, have a difficult time organizing their knowledge around central principles. They often fail to distinguish between the general concept and its examples, and may, for instance, define potential energy as a function of height. This failure to distinguish between general concepts and their examples is exacerbated by introductory courses which divide physics into separate domains: mechanics, electricity, magnetism, optics, etc. As an example, the concept of a conservative force is usually fragmented: the gravitational force, elastic force, and electrostatic force are all taught at different times and are not explicitly linked by instruction, although they are all merely examples of conservative forces. An inter-domain organization of knowledge has several advantages. It reduces the load on memory. It enables students to become accustomed to difficult concepts by elucidating them from various points of view. It enables the student to employ the methods used in one domain to solve unfamiliar problems in another.

Paper 19 focuses on conservative forces (fields), conservative forces which are proportional to 1/r², various examples of these, and the conservation of mechanical energy. A general concept is characterized by its critical attributes. For example, quadrilateral has two critical attributes: "quad" meaning four, and "lateral" meaning sides. Each concept has sub-concepts which are its examples: for instance, "square" and "rectangle". The problem is that examples have additional attributes not critical to the definition, such as the ninety-degree angles in both the square and the rectangle. Teachers often operate on two false assumptions. The first false assumption is that the strong resemblance between several examples is readily identifiable by learners. The second is that learners can easily differentiate between critical attributes and non-critical attributes. People place new information in a hierarchy. As teachers, we should see to it that central principles are at the top and examples are lower. As time passes, the lower levels are forgotten; the high-level important information is retained.
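To make the critical versus non-critical distinction concrete, here is a small Python sketch (my own hypothetical bookkeeping, not the authors'): a general concept carries only its critical attributes, while each example carries extras that students may wrongly treat as critical.

    # A general concept is defined by its critical attributes; its examples
    # carry additional, non-critical attributes (hypothetical illustration).
    quadrilateral = {"four", "sides"}                      # critical attributes only

    examples = {
        "square":    {"four", "sides", "right angles", "equal sides"},
        "rectangle": {"four", "sides", "right angles"},
    }

    for name, attrs in examples.items():
        # Anything beyond the critical attributes is non-critical decoration.
        print(f"{name}: non-critical attributes = {attrs - quadrilateral}")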

The MAOF teaching package was developed over four years. It has vector fields and potential as its central concepts, linking mechanics to electromagnetism. Two schematics of knowledge structures are illustrated; one is hierarchical, the other has feedback loops. During development, it was found that students totally confused the conservation of mechanical energy with the conservation of energy. Further, the conditions necessary for the conservation of mechanical energy were not clear to a large number of students. These and other confusions are cleared up by problem solving in different contexts and by the construction of concept maps. MAOF is a flexible review process for use in existing courses on mechanics and electromagnetism. The process uses a package of student workbooks and instructor transparencies for guided class work. Pre- and post-homework sets bracket the class work, preparing and amplifying the reviewed concepts.

Three general considerations come into play when constructing an instructional method. The first is to divide up the material into manageable units, avoiding divisions which are either too fine or too gross. The second is to decide which teaching-learning sequence to use in presenting general concepts. The third is to design a sequence of activities that guides the learner toward the desired knowledge structure. MAOF uses a bottom-top-bottom sequence to define general concepts. For example, a discussion of conservative forces can start by dealing with the force of gravity near the Earth's surface. This will lead to a definition of the concept in general, and will be followed by other examples such as Coulomb's force. MAOF uses a five-step model to guide the learner toward the desired knowledge structure. The steps are: solve, reflect, conceptualize, apply, and link. Each step is given a paragraph description, with concept maps playing an important role in the last two steps. The first step was kept deliberately simple by use of standard, familiar, few-step problems. The four MAOF units (conservative and nonconservative forces, 1/r² forces, electromagnetic fields, and vector fields) are each addressed via the five-step process. Relationships, whether as counterexamples or as special cases, are emphasized.

A diagnostic study examined the knowledge structure that students formed after a conventional physics course. This reference presents an example which probes the critical attribute of reference point in the concept of potential energy. The three questions of the probe reveal that approximately half of the students do not recognize contexts where the choice of reference point is arbitrary. An example is:


Statement 1: The point of reference for the calculation of electrostatic potential cannot be located on a positively charged object.


[Answer format: Correct / Incorrect, followed by an explanation]


... Student Response: Correct. It can not be located there because it must be at infinity.


In detailed interviews, eight students were asked to write down concepts additional to the general concept of potential energy and to note relationships by using directional arrows. In the example response, the student did not relate any critical attribute of potential energy, i.e., conservative force, reference point, conservation of mechanical energy, and work. The student did conceive the general concept, potential energy, as subordinate to one of its examples, "Gravitational Energy", and focused on non-critical details such as "m" (mass) and "g" (9.8 m/s²). Evaluation of MAOF used a test provided as paper 19's Appendix A. The test investigated attributes of the general concepts, attributes of examples, judgment of unfamiliar cases, and the conservation of mechanical energy. The two general concepts addressed are potential energy and 1/r² forces. In part one of the test, six critical conceptual attributes are tested. Four of these attributes showed statistically significant improvement from pretest to post-test. In part two, students distinguished between critical and non-critical example attributes at a statistically significant level for the four examples. In part three, only about 30% of the students on the pretest were able to ignore the many non-critical similarities and realize that Gauss' Law is in fact not applicable. This nearly doubled to 60% on the post-test. In part four, the use of energy instead of cumbersome kinematics went from 30% (pre) to 70% (post). The pretest was also given to twenty teachers. There were striking similarities between student and teacher knowledge structures, although all the teachers had a B.S. in physics. After going through the program as learners, the teachers found MAOF very useful in enhancing their own understanding of the material. Other programs which facilitate knowledge organization include the efforts of Van Heuvelen and those of Leonard et al.

The expert versus novice labeling will be highlighted in the MPEX paper (paper 32). It's unfortunate, but practical, that most reforms start as review. If reforms are both needed and effective, the obvious question is: why not teach that way initially? The talk of concept maps and knowledge structures is a bit more fleshed out in the paper, but the authors assume knowledge contained in their references. Still, their basic idea is sound; after all, we spend so much time on the simple harmonic oscillator in classical mechanics precisely to build a foundation model that will be applied again and again in many other fields. The simple harmonic oscillator is a "general concept" that physicists find quite useful. The admonition that the distinction between critical and non-critical aspects of examples is in fact not obvious needs to be taken to heart. Asking several students to tell you what's important, as in these eight interviews, is the quickest way to know you live in a different world than they do. This, after all, makes a lot of sense. You've had years of instruction which they have not yet had. Hopefully, all that instruction did change you from what they are to what you are now. A lot of instruction should result in a wide gap; bridging this gap between yourself and your students is the art of teaching. Finally, paper 1 asserts that a student is unlikely to surpass his teacher; this one only implies it. I hope students can and do surpass me. Otherwise, how could the human race advance over time? Some student must surpass his teacher; hopefully many do. How to get your students beyond your personal limitations would be an interesting subject to study.


24.3 : Paper 20, Physics Jeopardy Problems

Well, we've jumped from reading children's stories to changing the instructional structure of a university physics department. Perhaps a middle ground would be of use. The upcoming paper offers a modest proposal, one easily implemented by an individual instructor.


A. Van Heuvelen and D. Maloney, "Playing Physics Jeopardy," Am. J. Phys. 67(3), 252-256 (1999)


Jeopardy Problems are problems in which the student works backwards from a given mathematical equation to a diagrammatic, graphical, pictorial, and/or word description of a physical process. There are also Diagrammatic and Graphical Jeopardy Problems, where students invent a word or picture description and a math description consistent with the given diagram or graph. Jeopardy Problems have many strengths, and they are easy to create. They can even be multiple choice. They prevent formula-centered, plug-and-chug problem solving. They promote multiple representations, where equations, diagrams, and graphs all become the same short story about life. Jeopardy Problems highlight units, which are the key to determining whether we are dealing with pressures, densities, accelerations, or distances. Because of their probable novelty, students will need practice before these problems are used on tests. Jeopardy Problems come in multiple levels of difficulty, with this reference giving several examples ranging from the easy to the quite difficult. While this paper does not offer examples at this level, the authors note that "Richard Feynman's last blackboard [was]: Given S matrix, find problem."
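To give the flavor (my own illustration, not one of the paper's eight examples), hand students nothing but the equation

    0 \;=\; (12\,\mathrm{m/s})\,t \;-\; \tfrac{1}{2}\,(9.8\,\mathrm{m/s^2})\,t^2

and ask for the physical process. The units identify 12 m/s as a velocity and 9.8 m/s² as the free-fall acceleration; one consistent story is a ball thrown straight up at 12 m/s, returning to its launch height at t = 2(12 m/s)/(9.8 m/s²) ≈ 2.4 s. Working backwards from symbols to story is exactly the Jeopardy move.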

Jeopardy Problems help the student develop qualitative understanding and help the student learn to use the symbolic language of physics. They accomplish this in part by preventing the means-ends analysis which many novice students use to do end-of-chapter problems. This reference offers eight examples of Jeopardy Problems with a discussion of possible answers. The direct impact of Jeopardy Problems on learning is difficult for the authors to assess, as the problems are used in conjunction with several other methods. The combined results are impressive, with MB scores of 78% and FCI scores of 86% reported.


24.4 : Paper 21, Promoting Conceptual Change

Let's delve into a paper that PER detractors would love. It is part of published PER and is representative of the far end of the PER spectrum. It is also the last paper in our Potpourri section.


C. Kalman, S. Morris, C. Cottin, and R. Gordon, "Promoting conceptual change using collaborative groups in quantitative gateway courses," PER Am. J. Phys. Suppl. 67(7), S45-S51 (1999)


Students often hold views different from, or alternative to, those which they will be taught in their courses. The students will not easily relinquish their original viewpoints, because these viewpoints explain observations and required effort to construct. Conceptual change requires the students to critically examine their view of the world. For change to occur, students must make value judgments, rate ideas, and accept or reject material based on standards. Producing change thus requires evaluation, the highest ability in Bloom's taxonomy. Helping students initiate a growth process can easily span the entire course. Simplistically, there are two methods of problem solving: Templates and Paradigms. The key difference is that students who compartmentalize knowledge and apply different templates to different knowledge subsets lack the ability to apply principles garnered from one problem to an apparently different problem. Furthermore, even if problem solving methods change, knowledge acquisition methods are likely to remain compartmentalized unless critical thinking skills are developed. Finally, students not only have personal scientific concepts, but personal epistemological beliefs as well. Paper 21 provides many references to learning theory and philosophy.

Posner's learning framework for conceptual change is presented. Emphasis is placed on two points. First, students must know of problems with their personal scientific conceptions, usually via curriculum-induced conceptual conflict. Second, the student must not compartmentalize his knowledge. People can hold contradictory beliefs. Replacement, not simple assimilation, is the teaching goal. Using a model based on proposals by Hewson, the authors attempt to produce change in four common personal scientific concepts. These concepts are: bodies of different masses falling from rest through a non-viscous medium for a short time are found at later times to move at different speeds (concept 1); a fast-moving arrow stays in the air because of its great speed (concept 2); if a sandbag is dropped from an ascending balloon, immediately upon release the initial velocity of the sandbag is zero (concept 3); and a ball thrown in the air is in equilibrium at the highest point of its motion (concept 4). Instruction must show both that the replacement concept is intelligible and that the personal concept is less plausible. The authors argue that the above is a reasonable strategy for younger students but is cumbersome. They argue that it is better to "get the students to critically analyze the two concepts and come to the realization that the personal scientific concept needs to be replaced."

The basic procedure to achieve concept replacement is a collaborative group exercise. Three or four students are assigned to a group and to individual roles within the group (reporter, critic, etc.). Students are presented with a demonstration or qualitative problem. They discuss it for a fixed time period. They then report. The principle is that at least two ways of looking at a problem are presented non-judgmentally. Two groups with different concepts report to the class. The spokespersons debate, and the rest of the students may ask questions. The opposing issues are clearly presented; then the class votes as to which concept resolves the demonstration or qualitative problem. Voting is essential to combat compartmentalization. The professor then resolves the conflict.

The test used to determine concept replacement was the FCI + 3. The FCI was used to norm; the three additional questions were specific to this study and are attached to paper 21 as its Appendix A. The treatment group was more successful in making conceptual change than the control group. Treatment group test sheets are provided as paper 21's Appendix B. For concept 1, the students had very high pretest scores, so no inference could be drawn. For concepts 2 and 4, the treated group outperformed the standard group in a statistically significant fashion. For concept 3, no statistically significant difference between the groups was noted.
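For readers who want to see the shape of such a comparison, here is a minimal sketch, with invented counts, of testing one question's treatment-versus-control difference via a chi-square test on a 2x2 contingency table. This illustrates the kind of significance test involved, not the authors' actual procedure or data.

    from scipy.stats import chi2_contingency

    # Hypothetical counts of correct/incorrect answers on one post-test question.
    #            correct  incorrect
    table = [[34, 16],   # treatment group
             [22, 28]]   # control group

    chi2, p, dof, expected = chi2_contingency(table)
    print(f"chi2 = {chi2:.2f}, p = {p:.3f}")  # p < 0.05 suggests a real group difference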

I included this paper for several reasons. One, it was in the 1999 PER Am. J. Phys. Supplement, and I desired to review the entire supplement. Two, it provides some good theory summations. Three, the idea of conceptual debate by student spokespersons is intriguing. Four, paper 21 is an example of sweeping judgments based on two classes' worth of students on four questions, only two of which were statistically significant. And five, reading the paper will allow you to empathize, if not agree, with the viewpoints of some PER detractors. Were this reference alone, it could be dismissed as an aberration. Unfortunately, it is representative of its end of the spectrum. For example, consider D. Abbott et al.'s paper "Can one lab make a difference," published in PER Am. J. Phys. Suppl. 68(7), S60-S61 (2000). Abbott's published paper describes its authors' use of one McDermott tutorial on one lecture section's worth of students at one university. The students were assessed by an eight-question traditional quiz, which found a statistically significant difference on one question in favor of the tutorial group. The other seven questions were not statistically significant. The students were also assessed by six "Direct" questions, on which the tutorial group outperformed the traditional group by 20%. The "Direct" questions are referenced to a Ph.D. thesis. All questions were multiple choice, but they aren't provided. At best, Abbott is simply reporting the results of a multiple choice test given immediately after instruction to a small group of students. Yet he provocatively claims this makes a difference. To whom? The authors of paper 21 have even more prestigious company. The authors of paper 1 end their venerable FCI paper with severe contortions. They spend the last couple of pages singing the praises of the "Wells Method" even though they admit there was "no overall improvement in gains via the NSF physics education project conducted by Wells..." I am presenting paragraph G of paper 21, in its entirety, as an example of the contortions resulting when desire prevails over logic:


Question 30 was the only question that especially addressed concept 3; the idea that if a sandbag is dropped from an ascending balloon, immediately upon release the initial velocity of the sandbag is zero. The fact that there was no statistical difference between the two groups in their improvements on post-test scores may have occurred because the groups were still not used to working together, but it is impossible to verify this. A more interesting explanation is that this also had something to do with the way the question was framed. The key point is, as pointed out earlier, that students lack the ability to apply principles garnered from a problem to an apparently different problem.15,16 Students may not recognize that the problem of a brick falling off the edge of a descending construction elevator (question 30 in Appendix A) is identical to the problem of a sandbag released from an ascending balloon. The premise of this paper is that the students' development of critical thinking is essential. This is the only way that students will not simply accommodate the replacement concept by compartmentalization of their knowledge. After the first exercise, the students had not developed their critical thinking skills and the different appearance of question 30 caused them to utilize their personal scientific concept instead of the replacement concept. This would account for the result that no significant improvement of the treated group over the control group occurs for concept 3 whereas significant improvements were observed for concepts 2 and 4. To test this idea in September 1997, in a two-semester course on physics for nonscience students, Dr. Kalman tried the following experiment: After the students had read about inertia in the textbook, but only as applied to horizontal motion, Dr. Kalman presented the sandbag problem. By vote the entire class without exception concurred that the sandbag would fall immediately without rising. The correct result that the sandbag would initially continue with the same speed as the balloon was then fully explained in terms of inertia. The students expressed themselves as delighted with the correct answer. Dr. Kalman then presented an experiment from the "The Video Encyclopedia of Physics Demonstrations"23 in which a ball was fired vertically from a "car" moving horizontally at constant velocity. The video asks where the ball will land: in front of, behind or on top of the "car" and then pauses. Fully one half of the class considered that the ball would hit the ground ahead of or behind the "car".


The key words are "but it is impossible to verify this," and that pretty much is my summary of the whole work. A science is verifiable, and thus paper 21 illustrates just how far PER has yet to go in transforming the art of teaching into the science of teaching. The upcoming four papers focus on computer usage.



CHAPTER 25 : Interactive Engagement Methods Highlighting the Computer


25.1 : Paper 22, Physlets and Just-in-Time-Teaching


W. Christian, "Educational Software and the Sisyphus Effect," Computing in Science and Engineering, May-June 1999, 13-15 (1999)


Today's physics education software is fundamentally different from that of even 1991. Historically, the software was platform-dependent and thus obsolete within eighteen months as the supporting platform (Apple II's, etc.) was eclipsed. Today's software is platform-independent, being based on virtual machines, meta-languages, and open Internet standards, and is thus not subject to obsolescence on such a short time scale. Christian asserts that computers using commercial mass-market technology such as Java and JavaScript qualify under Hake's definition and validation of Interactive Engagement (IE). The author distinguishes between media-enhanced and media-focused problems, and he introduces Physlets and Just-in-Time-Teaching.

Physlets are media-focused problems in which the text does not give numbers. Observing an animation on the computer screen, the student must watch the motion, decide which physics concepts apply, and, with the mouse, measure the parameters he deems important (a minimum speed, for example). Only then can he solve the problem. This requires the student to consider the problem qualitatively and prevents the "plug-and-chug" method of problem solving. To be truly effective, computer-assisted instruction must create a feedback loop between the instructor and the student. Just-in-Time-Teaching (JiTT) is one method of achieving this feedback. JiTT consists of short Web-based assignments which are due a few hours before class. The instructor builds an interactive lecture around the students' answers. Thus, students take part in a guided discussion that begins with their own preliminary understanding of the material.

Christian notes that technological advances do not necessarily improve learning. Two examples of this are watching video and using database techniques in attempts to tailor individual curricula for learners. He further asserts that virtual reality, 3-D modeling, and voice recognition are likely to have little impact without curricular development efforts. The author believes that for computerization to have a long-lasting impact on science education, it needs to be based on a successful pedagogy and not on the latest compilers, hardware, or algorithms. The FCI authors concur, stating in paper 1 that "technology by itself cannot improve instruction." In an interesting historical note, the author mentions that post-Sputnik curricular reform material such as the Berkeley Physics Series is still available precisely because it was preserved in books and was not subject to the hardware obsolescence that erased much of the early computer-based curricular reforms.

The above review brought to mind an interesting letter to the editor by D. Edmonds published in Am. J. Phys. Vol. 69(6) entitled "Troy Ounces (or Tons) of Silver." The letter is also about something made with great effort and obsolete shortly thereafter; in this case, 146,000,000 Troy ounces of Fort Knox silver were required to construct it. During WWII's Manhattan Project, one method of separating uranium-235 from uranium-238 used calutrons. These large machines used the silver as wiring in huge electromagnets, which were disassembled at the end of the war. "Fort Knox wanted its silver back," all 5,000 tons of it.


25.2 : Paper 23, Numerical Integration


P. Assimakopoulos, "A Computer-Aided Introductory Course in Electricity and Magnetism," Computing in Science and Engineering, Nov/Dec 2000, 88-94 (2000)


Introductory electricity and magnetism courses expose students to new and complex concepts such as action at a distance, fields, potentials, and the superposition principle. In order for students to digest and assimilate these concepts, small groups of students must work out examples. These examples are typically those for which we have closed-form solutions. Today, with powerful, inexpensive PCs and versatile software packages, we should no longer restrict ourselves to this small set of examples. Holding to the pedagogical approach that students benefit by doing everything for themselves, as they thereby understand every step, the author advocates use of Microsoft Excel and Microsoft's Visual Basic for Applications to perform numerical integration on both closed-form (for comparison) and non-closed-form (for realism) problems. The author gives examples using both the trapezoidal rule and Simpson's rule. He offers examples in: (1) electrostatic fields and potentials, (2) magnetic induction, and (3) visualization of electric and magnetic fields. He provides a very nice 3-D Excel graph of the electrostatic potential for a quadrupole as determined by a numerical solution of Poisson's equation. The author speaks to several issues in his Assessment section. Therein I found one of the more poignant statements in PER literature:


Some of the happiest moments in my teaching career came from the satisfaction expressed by students at such simple accomplishments as verifying results obtained from numerical integration of functions by comparing them with closed-form solutions.
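
To make the flavor of that exercise concrete, here is a minimal sketch of the verification in the spirit of the paper, though in Python rather than the author's Excel/VBA, and with an example integrand of my own choosing: the potential of a uniformly charged rod, which has a closed form against which the two quadrature rules he names can be checked.

    import math

    def trapezoid(f, a, b, n):
        # Composite trapezoidal rule with n subintervals.
        h = (b - a) / n
        total = 0.5 * (f(a) + f(b))
        for i in range(1, n):
            total += f(a + i * h)
        return h * total

    def simpson(f, a, b, n):
        # Composite Simpson's rule; n must be even.
        if n % 2:
            raise ValueError("n must be even")
        h = (b - a) / n
        total = f(a) + f(b)
        for i in range(1, n):
            total += (4 if i % 2 else 2) * f(a + i * h)
        return h * total / 3.0

    # Potential at distance d from the center of a rod of half-length L
    # (with k*lambda set to 1): integral of dx / sqrt(x^2 + d^2) over [-L, L].
    L, d = 1.0, 0.5
    f = lambda x: 1.0 / math.sqrt(x * x + d * d)
    exact = 2.0 * math.asinh(L / d)  # the closed-form answer
    print(trapezoid(f, -L, L, 100), simpson(f, -L, L, 100), exact)

Watching the two numerical results converge on the closed-form value as n grows is precisely the small satisfaction the author describes.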


25.3 : Paper 24, To Simulate or Not to Simulate?

Computers are used not only for calculation purposes, but also to out-and-out replace lab experiments. After all, if simulation is good enough for our National Nuclear Laboratories, surely it's good enough for us.


R. Steinberg, "Computers in teaching science: To simulate or not to simulate?," PER Am. J. Phys. Suppl. 68(7), S37-S41 (2000)


Paper 24 starts by giving some background references and by acknowledging that computer simulations often teach in fundamentally different ways than the methods scientists employ to discover knowledge. This reference compares three interactive learning classes: two use computers and one uses pencil and paper. The strategy and curricula of all three are based on McDermott's work at the University of Washington. The classes were introductory calculus-based physics at the University of Maryland. The "fractions of the possible gain" in the FCI scores (<g>) were 32% and 25% for the two computer-based air-resistance tutorial classes and 32% for the pencil-and-paper air-resistance tutorial class. Air resistance was covered in one lecture and one homework problem. The tutorial pretest showed that while 80% of the students gave qualitatively correct graphs of position versus time for motion without air resistance, fewer than 10% could do so with air resistance. On velocity versus time graphs for motion with air resistance, only 9% indicated terminal velocities.
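
As a reminder of what these percentages mean, <g> is Hake's average normalized gain (paper 2): the class's actual pre-to-post gain divided by the maximum gain it could have achieved. A minimal sketch, with hypothetical class averages chosen only for illustration:

    def normalized_gain(pre_percent, post_percent):
        # Hake's average normalized gain <g>: actual gain over possible gain.
        return (post_percent - pre_percent) / (100.0 - pre_percent)

    # Hypothetical class: 45% correct pre-instruction, 63% post-instruction.
    print(normalized_gain(45.0, 63.0))  # about 0.33, i.e. a 33% <g>

So a 32% <g> says the class closed about a third of the gap between its pretest score and a perfect score, whatever that pretest score was.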

The lesson with and without computer use is described. Both the computer and the pencil-and-paper tutorials are essentially the same in content, coverage, and interactive engagement level. The significant difference is that the students use computers to verify their predictions in the first and use more graphs and free-body diagrams in the second. All students were diligent and engaged, but the second group had no means of "knowing" the answer and had to rely on consistency between their graphs and their diagrams. The first group seemed to rely on a more authoritarian viewpoint, that the right answer was something from the computer rather than something which the students had constructed themselves. A common midterm question is given along with the percent correct. Student performance was not significantly different across the classes, neither on the question nor on the test as a whole. The author sums up the research of others, most of it pro-simulation. He does add the proviso that "running a computer simulation is very different than doing a physical experiment."

Paper 24 is another example of nothing posing as a lot. In all ways, it's worse than paper 13. Basically, we have two versions of the same commercially available tutorial being compared. One version uses computer simulation; the other, graphs and diagrams. The FCI is used to assess air resistance learning, and the result is charitably a tie: 32% to 32%. First, the sample size is three classes. Second, this is not a comparison of simulation to real life; there are no actual experiments performed. Third, the FCI is not a valid assessment tool on the subject of air resistance. The FCI has one question on air resistance, with several others explicitly stating: "Disregarding any effects of air-resistance" (question 5); "When responding to the following question, assume that any frictional forces due to air resistance are so small that they can be ignored." (question 18). Implicitly, questions 16, 24, 25, and 26 all assume no air resistance. Only question 22 includes "the force of air resistance," and then only to acknowledge that air resistance affects the flight of a golf ball; question 22 does not even probe how air resistance affects this flight! The authors of paper 13 at least recognized that the FCI was not a valid assessment tool for their experiment. Paper 24 is mostly pro-simulation commentary based on references. The best that can be said about Steinberg's actual experiment is that compared to pencil-and-paper graphs and diagrams, computer simulations don't hurt. And if you choose to focus on the 25%, you couldn't even say that. In fact, given the invalid assessment tool, no one can say anything, and yet this paper is published in the American Journal of Physics.

That was an ending, and a good one, but I want to comment on two other things. The Studio Physics authors (paper 15) realized that the number of people per computer is an issue that should not be conspicuous by its absence, even though their only constructive comment is to avoid a 3-person-to-2-computer ratio. Then there is the issue of a server crash. Here at UCSC, our upper-division lab's computer system crashed for the last few weeks of class in the winter quarter of 2002. This resulted in over half of the students having to take incompletes and make up the missing work in the spring quarter. Relying on computers is a double-edged sword, and when things go bad, they can go horrid. Our last paper focusing on computers is coming up. It is also the last paper focused specifically on how to teach physics content.


25.4 : Paper 25, Online Homework


S. Bonham, R. Beichner, and D. Deardorff, "Online Homework: Does it make a Difference?," The Physics Teacher, Vol. 39, 293-296 (2001)


The experimental fact is that good, conscientious, extensive, time-consuming, hand-written comments on homework graded by a graduate student have NO observable benefit or detriment to undergraduate learning as compared to computer grading. Computers do have the advantage in that they can support pedagogies other than the traditional back-of-the-textbook physics problems used in paper 25's comparison. Methods of student assessment ranged from the FMCE, through quiz averages, all the way to self-reported time spent on homework per week. Students respond positively to using the computer for homework. This reference supports the viewpoint that technology does not improve or harm student learning, but rather that pedagogy is the critical issue.

Pedagogy will be addressed in a later chapter. Next up, a small chapter focusing on what to teach, which I view as how-driven. Interactive Engagement methods require more time than Traditional lectures, and so we face the classic depth versus breadth problem. As I.E. methods chose depth for us, they are also choosing breadth against us. So, the question is what content to cut? Because everybody starts with a slightly different course content and a desire to be psychologically positive, the question is rephrased to be: what to keep? What to keep creeps into what to teach. Pandora's Box opens. The majority of papers in PER which address the subject of what to teach are actually advocating the inclusion of modern physics and electronics into physics curricula. After all, qualitatively these subjects are quite interesting and fun, unlike algebra (otherwise known as mechanics). Further, why stop teaching physics at the 19th century, when we're in the 21st? In all this, what to cut quietly disappears, with each I.E. instructor deciding on his own to cut buoyancy. Which is O.K., in a way, except nobody gets around to teaching buoyancy, and thus, one old traditionalist has a stick with which to beat I.E. about the head and shoulders.

My use of buoyancy here and earlier is an echo of the FCI (paper 1). Its authors end a paragraph in defense of problem 12 with: "Besides, some teachers might think physics students should know why things float!" The FCI authors, of course, definitely do not self-label as Traditionalists. Sadly, what to cut is a choice often made by an isolated teacher praying that the next course doesn't catch his students too much by surprise. Fundamentally, I.E. or not, our courses either have not enough time (10-credit classes, anyone?) or have too much content. Nobody is willing to publicly advocate throwing out buoyancy, although in private it must be common. I say "must be" in part because of the aforementioned FCI (paper 1) paragraph, but more so because of the huge choice of 12B over 12D on the old FCI. In paper 1, 12B = 949 students, 12D = 0, combined wrong distractors = 176 on post-instruction testing. It's not often the right answer is a null. It's even more unusual to accept two answers as correct on a MCSR test question, as "SR" means "single response." The reason advanced to "allow" 12B as "acceptable" is simply that nobody was picking 12D! So, if you missed my buoyancy allusions earlier, you're now in the know; how does it feel? To capstone the issue, by accepting two answers, question 12 on the original FCI made nobody happy and opened the test up to criticism. The revised version comes down on the practical side by getting rid of option D entirely [revised FCI question 29], thereby implicitly acknowledging that, in fact, NO, students do not need to know why things float!

So, what else do they not need to know? It's the unanswered question in PER. Strictly by inference, I get the idea that students definitely don't need electives. We already know they should read outside of class (paper 11) and do homework problems at 30 minutes each (paper 14). What we don't know is how much reading or how many problems. The IMPEC room being open 24 hours a day (paper 16) brings it all together; one lives physics, one doesn't merely study it. ERGO we cut no subject; let us only add subjects, with any time savings coming from how we teach, not from what. Which brings us full circle in absurdities, because we have in our I.E. methods picked the time-intensive method of teaching physics!



CHAPTER 26 : Interactive Engagement Curricula


26.1 : Paper 26, Physics Goes Practical

The next few papers advocate including electronics classes in the curricula and modern physics in the introductory physics syllabi. These are good ideas, but there is no advice on what to cut out of curricula or syllabi to make room/time for these additions. Sadly, it's not required for authors of good ideas to designate what current practice must be cut to make room on the smorgasbord-of-physics for the up-and-coming hot new idea.


T. Usher and P. Dixon, "Physics goes practical," Am. J. Phys. 70(1), 30-36 (2002)


The applied physics option at California State University San Bernardino is distinguished from the traditional B.S. primarily by three courses: Introductory Electronics, Data Acquisition and Control, and Advanced Electronics. These courses focus on the analog electronic domain because digital courses are available through other departments. In Introductory Electronics, pre-calculus freshmen are taken from Ohm's law to constructing lock-in amplifiers in one quarter. In part, this rapid progression is the result of using LabVIEW and borrowing concepts from both "Workshop Physics" and "Studio Physics." It's also due to the amount of time invested by students: 6 hours and 40 minutes a week for ten weeks, committed in a largely laboratory setting. The option strongly pushes paid internships for a number of reasons. One reason is that such a program establishes good contacts with industry, thereby providing important curricular feedback. Finally, this program is part of a broad local effort to attract small high-tech companies to the San Bernardino area.


26.2 : Paper 27, Resource Letter: TE-1 Teaching Electronics

Paper 27 also addresses electronics, albeit for a different reason; it is a Resource Letter with a good introduction. Any electronics instructor will find it useful.


D. Henry, "Resource Letter: TE-1: Teaching electronics," Am. J. Phys. 70(1), 14-23 (2002)


The letter lists 253 publicly available resources, including web sites, lab manuals, articles, textbooks, reference books, videos, and sources of equipment and parts, both new and used. In the introduction to the list of resources, the author strongly advocates an electronics course for future scientists. The focus of this course is different from that of an electrical engineering course. While scientists will not be designing stereo preamplifiers or reinventing the digital computer, they will be called upon to modify or combine electronic lab equipment and instrumentation in both standard and creative ways. Physics students who go into experimental physics or engineering will encounter overlapping generations of electronics equipment. They will research and work in environments that are electromagnetically noisy. Worse, both local and distant technical support for most electronics equipment is becoming minimal or nonexistent. In a related manner, the students will have to dig into undocumented software written in a multitude of languages. Not only will a formal electronics course help them with these issues, but it will also compensate for a world in which normal life no longer requires even a modest understanding of how electronics equipment works. Such a course would also compensate for the less-is-more trend in introductory physics sequences, which frequently sacrifice coverage of AC circuits and sometimes even of discrete digital components.


26.3 : Paper 28, Macroscopic Phenomena and Microscopic Processes

For those who followed my earlier commentary, the last sentence in paper 27's review makes for a good feedback loop (ha ha, bad joke). The upcoming paper is our last electronics paper before moving on to modern physics.


B. Thacker, U. Ganiel, and D. Boys, "Macroscopic phenomena and microscopic processes: Student understanding of transients in direct current electronic circuits," PER Am. J. Phys. Suppl. 67(7), S25-S31 (1999)


Paper 28 starts by acknowledging that most PER has focused on mechanics, but that in recent years physics education researchers have turned their attention towards E&M. The authors propose to test Eylon and Ganiel's assumption that knowledge of microscopic models would enable qualitative reasoning about macroscopic E&M phenomena. They compare a Traditional class to one using Chabay and Sherwood's new text, which emphasizes microscopic processes. A written questionnaire, very similar to the one used in a previous study of Israeli high school (Group HS) students, was given to 90 University of Ohio (Group A) students and 26 University of Michigan-Flint (Group B) students. Hour-long interviews were also conducted with 20 and 6 students, respectively. Questionnaires and interviews were conducted during the last few weeks of the students' calculus-based introductory E&M course. Group A was taught in the traditional format of lectures, labs, and discussion groups. Group B differed only in that it was "based on a text that focused on qualitative reasoning, desktop experiments, well-constructed explanations and discussion with partners." Group HS was taught while models of microscopic processes were still being developed.

The five assessment questions are given in paper 28's appendix. Questions 1-4 are identical to the Israeli study. Question 5 was added to test student understanding of grounding. The authors' hypothesis is that for students to have a solid understanding of transients in DC electric circuits, a model of microscopic processes is required. The spread between Group A (normal textbook) and Group B (reform textbook) ranges from 18% / 90% to 68% / 80% in favor of Group B, depending on the question. The correct written explanations of Group A versus Group B are also strikingly different, with B's being much longer and more specific. A's wrong answers illuminate four points. First, too heavy a reliance on memorization and math can be fatal; an illustrative example is given. Second, students erroneously believe that the order of elements matters in a series circuit. Third, students think that charge somehow has to jump from one plate of a capacitor to the other in order for current to flow. Fourth, 7% of Group A clearly stated that they did not understand the relationship between charge and current. Group HS did significantly better on questions 3 and 4 than either Group A or B. The authors credit in-service courses taken by the Israeli teachers for this difference [for example, A = 7%, B = 41%, and HS = 64% on the explanation for question 4b]. Paper 28 offers percentages for both "correct answer" and "correct explanation." A common mistake for both groups A and B was the belief that charges originate in the battery only; the role of the conductor needs to be given more attention.

Neither Group A nor Group B had instruction where grounding was specifically discussed. The A/B percentage correct ranged from 83/100 down to 13/14 on the different parts of question 5. This was an important and discouraging finding. It indicates that even after explicit instruction, students do not use microscopic mechanisms when confronted by completely unfamiliar phenomena. This reference provides several discouraging student quotes, with neither Group A nor Group B discussing electric forces or electric fields in their explanations to question 5c. There was enough consistency in A's wrong answers to questions 5a and 5b to believe that many students do not understand "net charge" and misunderstand "potential difference."

The paper offers a sample interview of a Group A student. Many of A's interviews were similar in that students would "search for replies that would utilize phrases (and even equations) they had encountered." They did not use a mental model of the physical situation, and some even failed to see when they were contradicting themselves. B's interviews were quite different. They would recognize inconsistencies and construct models easily. The authors conclude that models of microscopic processes should be introduced as an integral part of any E&M course. This is based on the assertion that "Group B students exhibited a superior understanding of the phenomena... including those which were less familiar to them, than Group A students."

Let's start the commentary with five small items and then discuss the big problem. First, an emphasis: groups A and B differ only in their required textbook. Group A used a "traditional text" and Group B used a "text that emphasizes models of microscopic processes." Second, placing paper 28 in this chapter is not ideal, but this is the best spot I could find for it. Third, the best part of this paper is its identification of several common student misconceptions in E&M. Fourth, there is no pre-instruction comparison between groups A and B; thus, it is an assumption that post-instruction differences were the result of instruction only. Fifth, and the last small item, paper 40 notes that in thermodynamics, students use microscopic processes to justify both science and misconceptions alike. Now to the big problem: the body of paper 28 contradicts its conclusion.

Question 5 is a four-part question; the last two parts ask questions on the never-taught subject of grounding. The authors had hoped students taught via the microscopic process method [Group B] would be able to transfer their skills into an unknown domain [grounding] and outperform the students who were not exposed to microscopic process considerations [Group A]. The results for the four parts of question 5 are: 5a - 83/100, 5b - 26/52, 5c - 13/14, and 5d - 13/17; all numbers are percentage correct, with Group A first and Group B second. Question 5 itself follows for context:


5. Consider what would happen if, after the switch S had been closed for a long time, the capacitor was removed from the circuit (without touching the capacitor leads).

a) Would there be charge on either plate of the capacitor? What would be the net charge on the capacitor? Explain.

b) Would there be a potential difference across the capacitor? Explain.

c) If one plate of the capacitor were then connected to ground, would the charge on either plate change? Explain your answer.

d) If one plate of the capacitor were then connected to ground, would the potential difference change? Explain your answer.


The authors themselves acknowledge in the body of paper 28 the unhappy repercussions of the miserable tie in part 5c [13/14]:


In total, the answers of the two groups of students were not very different on this question. This is an important (and somewhat discouraging) finding, since it indicates that even when instruction emphasizes microscopic mechanisms (as done for Group B), students do not use these mechanisms when confronted with phenomena that are completely unfamiliar to them: transfer is of limited extent.


The authors, however, conclude paper 28 quite differently:


We found that Group B students exhibited a superior understanding of the phenomena and were better able to give valid explanations in a variety of situations, including those which were less familiar to them, than Group A students.


There is no "less familiar" category of situations. There are "taught" and "not taught" subjects, which are assessed by five questions. The last half of question 5 is the only assessment of the not-taught subject [grounding]. The results 13/14 and 13/17 do not prove Group B to be "better able" than Group A in giving valid explanations to questions [situations] in subjects not taught [less familiar] to them. In fact, the one and four percent differences in parts 5c and 5d are not shown to be statistically significant, and given the sample sizes of 90 and 26 students, I'm unwilling to assume such significance. Worse, any advantage Group B may claim cannot be attributed to the use of microscopic process considerations, as the authors explicitly state that "students do not use these mechanisms" in addressing problems in areas not taught [completely unfamiliar] to them. So we go from groups A and B being "not very different" to Group B being "better" through a linguistic shift from "completely unfamiliar" to "less familiar." This is a linguistic shift that doesn't exist, as familiarity is never assessed [say, by a pretest]. Knowledge is assessed on an untaught subject, in the hopes of documenting the use of a taught methodology. The methodology was not used. The results between groups on questions relating to the not-taught subject of grounding are a tie. Thus, in fact, Group B is not better able to give valid explanations in less familiar situations than Group A students, precisely because they did not answer untaught questions better than Group A students.
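
To make my unwillingness concrete, here is a quick sketch of the standard two-proportion z-test applied to part 5c, using the quoted percentages and sample sizes (the paper itself reports no such test, so this is purely illustrative):

    from math import sqrt

    def two_proportion_z(p1, n1, p2, n2):
        # Two-proportion z-statistic using the pooled proportion estimate.
        pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
        se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
        return (p2 - p1) / se

    # Part 5c: 13% correct of 90 Group A students vs 14% of 26 Group B students.
    print(two_proportion_z(0.13, 90, 0.14, 26))  # about 0.13

A z of roughly 0.13 is nowhere near the 1.96 needed for significance at the 5% level, and the 5d comparison, with the same sample sizes, is likewise far from significant.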


26.4 : Paper 29, Modernizing Introductory Physics


C. Holbrow, J. Amato, E. Galvez, and J. Lloyd, "Modernizing introductory physics," Am. J. Phys. 63(12), 1078-1090 (1995)


Paper 29 is about the experiences and lessons learned in the eight-year reform of Colgate University's first-term calculus-based introductory physics course. The motives for reform were: (a) to improve student understanding of basic physics concepts, (b) to exploit modern technology, (c) to bring modern physics into the introductory syllabus, and (d) to increase student engagement. Paper 29's first section is a thumbnail sketch of reform literature with twenty-nine references. Some of these references offer a general history of PER, notably those by McDermott and Arons. Other references, notably those of Bethe and Feynman, are very specific in advocating the inclusion of quantum mechanics and relativity into introductory physics classes. Quantum mechanics and relativity are advanced as fundamental to all contemporary physics. The reason Colgate changed was twofold. Positively, much of modern physics is significant and interesting, for example: lasers, the standard model, QED, chaos, high temperature superconductivity, nanotechnology, and the non-locality of quantum mechanics. All of this would attract students. Negatively, the past drove too many students out of physics; there was too much to cover, too fast, and worse, a lack of interconnectedness. The ideas of quantum mechanics and relativity are the fundamental interconnections of modern physics. Deciding to change is impossible if one must foresee how each change affects the entire curriculum and predict the long-range consequences of reform. Rather, Just Do It! There are obvious and immediate problems in the way things are. It is better to attempt reform and deal with the consequences as they are experienced than not to attempt it.

Colgate was blessed with several advantages in getting approval for change. These include both a six-person faculty and no requirement to coordinate with an engineering curriculum. The basic reform strategy was to give the class a well-defined story line. In general, this provides a useful guide for what to include and what to leave out. The story line is the connection between course topics. At Colgate, "Atoms" was the story line, which incorporates wave-particle duality, Heisenberg's uncertainty principle, quantization of light, and the existence of discrete energy states. Further:


By choosing atoms as our theme we could put off the conceptual complexities and subtleties of Newtonian physics for a term, and develop basic skills of quantitative reasoning in the simpler context of the physics of atoms - which up to the Schrödinger equation is largely the conservation of energy, the counting of discrete entities, and the superposition of sine waves.


This theme also allows a better match between high school physics (a prerequisite) and university physics. Students are daunted by reasonably rapid quantitative arguments, lack facility with scientific notation, do not follow simple ratiometric reasoning, are confused by scaling arguments, are intimidated by casual use of trigonometry, and, while "reasonably well-schooled in calculus," are clumsy and inefficient at simple algebra.

The four big changes to the Colgate course are: (1) considerable quantum and atomic physics were put into the syllabus, (2) these were also put into the laboratories, (3) a program of computer exercises was instituted, and (4) half the lecture time was replaced with small-group recitations. The syllabus' major intellectual goal is to familiarize students with basic post-Newtonian ideas, such as mass-energy equivalence, discrete energy states, wave-particle duality, and the associated non-locality. To develop skills in manipulation and analysis of physical situations, the authors use relevant historical experiments, such as Boyle's experimental determination of his gas law and Coulomb's verification of the inverse square law of force between point charges, on through Faraday, Millikan, Ulrey, Thomson, Möllenstedt and Düker, Bohr, Geiger and Marsden, and finally, Moseley's original published papers. The laboratory has ten experiments and two exciting modern lab-based computer exercises. The experiments use modern instruments and equipment: liquid nitrogen, electron fine-beam tubes, oscilloscopes, lasers, microwaves, spectroscopes, bubble-chamber photos, and radioactivity counters. The labs introduce unfamiliar ideas and phenomena by concrete examples. Electric and magnetic fields, electrons, interference patterns, line spectra, superposition, means and standard deviations, and radioactivity are taught. Paper 29 goes into two and a half pages of detail. I hope you examine this reference; here are the lab titles: Molecular Velocities, Electrolysis, Electric and Magnetic Fields, Waves and Oscilloscopes, Waves: Diffraction and Interference, Bubble Chamber Photos, Bragg Diffraction, A Little Statistics, Radioactivity, and Spectrum of Atomic Hydrogen. The computer use emphasizes spreadsheets, a useful skill independent of a student's commitment to science. One scientific use of spreadsheets is in the Millikan Oil Drop Experiment, where the sort facility exposes the discrete steps with vivid clarity (see the sketch following this paragraph). Less lecture and more recitation have fostered better interaction between faculty and students and more cooperation among students. Because of the small size of the recitations, more faculty are involved in the course, allowing for increased suggestions and criticisms, a fact even further enhanced by all faculty attending each lecture! Finally, the feedback from students to faculty during recitation has resulted in cognitive overload avoidance, i.e., the syllabus was shortened, simplified, and even a few items eliminated.
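
The Millikan point deserves a moment. Here is a minimal sketch of why sorting works, in Python rather than a spreadsheet, and with fabricated droplet charges (small integer multiples of e plus measurement noise), not the course's actual data:

    import random

    E = 1.602e-19  # elementary charge in coulombs
    random.seed(0)
    # Fabricated measurements: droplets carrying 1, 2, or 3 electrons,
    # each measured with a few percent of noise.
    charges = [random.choice([1, 2, 3]) * E * random.uniform(0.97, 1.03)
               for _ in range(30)]

    # Sorting groups the measurements into plateaus near 1e, 2e, and 3e;
    # small gaps within a plateau and big jumps between them mark the steps.
    for q in sorted(charges):
        print(f"{q / E:.2f} e")

An unsorted column of charges looks like noise; the sorted column shows the staircase, which is exactly the vivid clarity the authors describe.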

In answering "How have we done?", the authors focus on their in-house exams, student evaluations, student retention, and faculty morale. The authors attached their 1993 final exam as an appendix to give a "sense of what we ask of students by the end of the term." The course grade average is around 2.75 out of 4.00. In evaluations by students, the results depend strongly on who is teaching and how much previous experience this person has had with the course [a result also noted in paper 16]. It takes about three years for a professor to achieve highly satisfied students. Satisfied students work; discouraged students engage in intellectual damage control by reducing their investment (work and time) in the course. Since this reform occurred, there has been a sharp decline in mid-term dropouts and a significant increase in the fraction of students going on to the second term. Faculty morale is much better than in the dark days of the first couple of reform years. Faculty morale is so good that the authors have taken the second step: they are converting the second-term physics course to the reform theme "Where we are in the universe, and why do we think so." In this second course, the authors show how physics has been used to discover the galaxy and the nature of stars, a rich context for using vectors along with a more matter-of-fact use of calculus.

Under "What we learned", humility stands as chief. Choice of theme is not nearly as critical as having one. Close contact with faculty in small groups helps students persist, where in the old format they might have given up. Teach less material. Have the laboratory in close coordination with the lecture. Emphasize basic quantitative skills and gradually increase their complexity. To add new material, something must be cut and in a traditional course that something is Newtonian Mechanics; you don't need as much Newtonian Mechanics as many teachers assume. And finally, if you're going to make changes in the first term, you'll end up needing to change the following terms as well. The level of effort, persistence and faith required to make change is very great.

There's a lot here that I like. The quote that modern physics is a simpler context in which to learn quantitative reasoning than the Newtonian stands out, and it ties in with the statements about student quantitative difficulties. Finding students' calculus skills better than their algebra skills is a bit of a shocker, but it is consistent with the authors' findings regarding the even simpler issues of ratios and scientific notation. This in turn ties in with both their advice to ramp up quantitative skills over time and their advice to cut back on Newton. One reason to cut Newton is time; the authors prefer to invest a large chunk of time in their labs. They raise the comment we've heard before that the lecture and lab should be closely coordinated, a desire expressed to me by many of my past lab students. The labs also emphasize the experimental, empirical nature of physics and add a beneficial historical flavor to the class, tying in nicely with paper 17. A main goal of the authors is student retention, a goal they share with the authors of papers 16 and 42. Too many students were leaving for chemistry, mathematics, and philosophy. Retention is complicated by only 30% of the incoming class taking physics because they like it. Those who like it credit their high school physics classes for their happiness. On the face of it, this is a good thing, until you note that at this selective liberal arts college, high school physics is a prerequisite for this class! The authors note that even after their shift away from the "drink-from-a-fire-hose" traditional methods, the early versions of this class were still causing cognitive overload. To avoid this overload, they eliminated a "few" items; unfortunately, they don't tell us which. They do attach an in-house final exam, to provide a flavor of the expectation level and subject coverage. They make a fairly standard nod to computers, particularly spreadsheets, but are quite nonstandard in their focus on faculty morale and in their requiring faculty (not graduate students) to attend lectures! It's well worth a read by anyone seeking to change the established ways, even if your changes or conditions don't specifically match theirs. Just Do It! Easier said than done, of course. Having only six faculty members and no engineering curricula helped, but I was left wondering about the biology tie-in. At my school, UCSC, well over half of the introductory students are premed/biology types. This ends the what-to-teach section. Upcoming is the epistemology chapter.



CHAPTER 27 : Epistemology


PER advocates Interactive Engagement teaching methods over Traditional methods. It does so primarily because of the imbalance of <g>'s in favor of I.E. classes. <g> is derived from FCI scores, and so entire curricula are judged on how well students learn the concept of Newtonian force. Long-term retention is generally not addressed, and medium-term retention is usually disappointing. Still, in the short term, I.E. methods do increase student understanding of force as measured by the FCI. All I.E. methods exchange the traditional breadth of coverage for depth of understanding in fewer subjects. While this exchange keeps the total time learning physics a constant, any specific topic requires more time to learn. This time increase is the cost of the constructivist learning theory that underpins I.E. methods. Very basically, students learn better when they construct their own learning than when the teacher provides knowledge for the student to remember. Constructing your own knowledge takes time; it is also something a teacher cannot do for you. Still, at some level, teachers can facilitate the learning process. Thus our interest in epistemology, which attempts to answer: How do people learn? Here we step outside of physics and into educational theory. Having been a secondary school teacher for six years, I am personally aware of the multiplicity of contradictory educational theories advanced over the years. I am witness to the misapplication of theories and have read Howard Gardner's comments on the misapplication of his theory of multiple intelligences. Further, I am acutely aware that the field of education lacks the equivalent of the experimental physicist, with his all-important role of weeding out and refining theories prior to their dissemination to the engineer/teacher level of societal use. Having said that, I do believe that there is science buried in here. Chemistry was born of alchemy, astronomy of astrology. Given the current work using MRI brain scans, I am hopeful that education will join the sciences. There is valuable knowledge here; where is the question. There are six papers in this section; none are original research in cognitive studies. All seek to apply cognitive studies to physics learning. There are no papers proving or disproving the epistemological theories themselves, nor any mention of competing theories to constructivism.


27.1 : Paper 30, Implications of Cognitive Studies


E. Redish, "The Implications of Cognitive Studies for Teaching Physics," Am. J. Phys. 62(6), 796-803 (1994)


Society has a great need not only for a few technically trained people, but for a large group of individuals who understand science. In order to reach our students, we need to pay attention both to how students learn and to how they respond to our teaching. We must treat the teaching of physics as a scientific problem. Science is a continual dance between theory and experiment. A few physicists are performing detailed experiments to determine what our students are thinking and what works in the teaching of physics. Of these few, even fewer seek to place their results in a general theoretical framework. Such a framework is needed; collecting data into a wizard's book of everything is not science. While learning physics is much more complex than the processes most cognitive scholars address, the basic ideas of cognitive studies are useful to physics teachers. Redish offers some wide-ranging references on cognitive studies in his paper. He groups cognitive ideas useful to physics instructors into four broad principles and elaborates on them in corollaries. He warns that these will not be "hard and fast" rules and that they must be used in conjunction with experimental data. Fundamentally, the physics content we teach is important, but it must be viewed in the context of what our students learn, not of what we teach. The principles are: (1) Building Patterns, the Construction Principle; (2) Building on a Mental Model, the Assimilation Principle; (3) Changing an Existing Mental Model, the Accommodation Principle; and (4) the Individuality Principle.

Building Patterns, the Construction Principle (principle 1) can be expressed as: people tend to organize their experiences and observations into patterns or models. Mental models have the following properties: (1) they consist of propositions, images, rules of procedure, and statements as to when and how they are to be used; (2) they may contain contradictory elements; (3) they may be incomplete; (4) people may not know how to run the procedures present in their mental models; (5) elements of a mental model don't have firm boundaries, so similar elements may get confused; and (6) mental models tend to minimize expenditure of mental energy; people will often do extra physical activities to avoid some serious thinking. Two important aspects of principle 1 are (a) people must build their own mental models, and (b) students may hold contradictory elements in their minds without being aware that they conflict. Part (a) above is the cornerstone of constructivism, which, at its extreme, states that you can't teach anybody anything; all you can do as a teacher is to make learning easier for your students. This demands feedback and evaluations from students to see what works and what doesn't. Constructivism focuses on our students' learning, not our teaching. The author expands on principle 1 over two pages and five corollaries. Simplistically, people learn better by doing than by watching someone else do. Students enter our classrooms with naive mental models they developed by living in a non-ideal, friction-dominated environment. It's not enough for students to know the right information; access, cross-checking, and evaluation of relevance are all of equal importance. Evaluation of process is as important as evaluation of result, as the author shows in this quote:


I once asked a student (who had done a problem correctly) to explain his solution. He replied: "Well, we've used all the other formulas at the end of the chapter except this one, and the unknown starts with the same letter as is in that formula, so that must be the one to use."


To determine what our students know, we have to give them the opportunity to explain their thoughts in words. The last corollary to principle 1 is on the "meta" issue: many students do not have appropriate mental models for what it means to learn physics. Redish also warns those of us who love learning that the experience of lecturing and teaching is such a powerful learning experience for ourselves that we may not want to give it up.

Building on a Mental Model, the Assimilation Principle (principle 2) states that it is reasonably easy to learn something that matches or extends an existing mental model. There are three corollaries. First, it's hard to learn something we don't almost already know. Second, much of our learning is done by analogy. Third, touchstone problems and examples are very important. Touchstone problems become "queen bees for new swarms of understanding." We spend so much time studying the mass on a spring, not because of any intrinsic interest, but rather because it is the touchstone problem for all kinds of harmonic oscillation, from electrical circuits up through quantum field theory. The author also raises the whole issue of common versus technical use of a word; force is the classic example, with students often using it in a different sense than the instructor.

Changing an Existing Mental Model, The Accommodation Principle (principle 3) states that it's very difficult to change an established mental model substantially. Traditionally, we've relied on an oversimplified view of principle 1, the patterning principle, to say: "Just let students do enough problems and they'll get the idea eventually." Unfortunately, this doesn't work. There is a method that does work. A prediction must be made by the individual based on his existing mental model, and the following observation must be a clear and compelling contradiction. The proposed replacement mental model must (1) be understandable, (2) be plausible, (3) be useful, and (4) have predictions in strong conflict with the original model and strong agreement with observations.

The Individuality Principle (principle 4) is that since each individual constructs his or her own mental ecology, different students have different mental models for physical phenomena and different mental models for learning. The large standard deviations obtained in educational experiments are part of the measured results; they are not experimental errors. In our first look at what students are doing, it's very important to consider them one at a time, to interview them in depth, to give them substantial opportunity for thinking aloud, and to not give them any guidance. While later studies seek the frequency of various modes of thinking using large populations, many valuable studies in PER are done with very small sample sizes. The corollaries to principle 4 are: (1) People have different styles of learning; there is no unique answer to the question: What's the best way to teach a particular subject?; (2) As physics teachers are an atypical group, our personal experiences are a very poor guide for telling us what to do for our students; and (3) if we want to know the state of our students' knowledge, we not only have to ask them; we must listen to them. Redish concludes that if we are to make serious progress in reaching our students, we will have to shift our emphasis from the physics content we enjoy and love so much, to the students themselves and to their learning.

Paper 30 presupposes we need/want to teach real physics, not the appreciation of physics, to a high volume of students, or at minimum, to teach an introductory physics sequence to a large volume of students. PER allows that a small number of students either (1) get through despite, (2) are unaffected by, or (3) actually benefit from, current traditional instruction. The point is not to gate-keep the masses out of knowledge and into ignorance, but rather to bring the light of physics to all. Instead, I would argue for an approach similar to that pursued by the Art Departments, where it is accepted that (1) not everybody is an artist, and (2) non-artists can benefit from knowledge about art. Thus the proliferation of such classes as Art Appreciation for the masses, while reserving such classes as Figure Drawing for the few. Physics classes should be similarly divided. Shifting gears, the comment about small studies is true; Hammer's Ph.D. thesis involved observing six physics students. There are epistemological theories other than constructivism, although they make no impact on PER. It is interesting to note that a strict reading of the Individuality Principle would argue that student misconceptions should be unique, a reading in contrast to the limited number of misconceptions found by researchers, and a reading that undermines I.E. curricular reform, as such curricula rely on the same widely shared and recurring misconceptions being corrected year after year.


27.2 : Paper 31, More than Misconceptions: Multiple Perspectives

The upcoming paper argues that student misconceptions are not just problems that need to be fixed; they contain opportunities which aid the instructor and the student in creating scientific ideas. There are those, notably McDermott (paper 47), who believe the idea is to get rid of misconceptions and then build on a cleaned work site. Hammer will argue for rearranging the existing structure into something that is scientifically useful. His paper is in part a reaction to McDermott's preexisting "elicit, confront, resolve" mantra, which explicitly views student misconceptions as problems, not as opportunities. Unfortunately, we are presenting Hammer's and McDermott's work out of chronological order, another "straightened" strand of our PER web.


D. Hammer, "More than misconceptions: Multiple perspectives on student knowledge and reasoning, and an appropriate role for education research," Am. J. Phys. 64(10), 1316-1325 (1996)


Physics education research has not produced any theories with the precision, coherence, and stability that have been achieved in physics. It has, however, developed compelling evidence to discredit traditional methods and convictions. Unfortunately, many physics instructors remain confident in intuitions born of their extensive but unstudied experience as students and as teachers. "Like the intuitions physics-naive students have developed from their extensive unstudied experience of the physical world and in which they are confident, these instructors' intuitions are inadequate and often incorrect." While PER is not yet an applied science, it does provide perspectives that expand, refine, and support instructors' perceptions and judgments. PER helps expand the instructor's concentration to include not only physics content but also how the students interact with that content. Paper 31 uses a brief excerpt from an introductory physics class conversation as a stimulus for five possible reactions. These reactions are examples of five perspectives an instructor may have on the students' knowledge. The moment is real, excerpted from weeks of video taken at a public high school in Massachusetts. It was chosen in part because the idea spoken to, that a force is needed to make the ball move, has a rich body of research associated with it, and because the substance of the students' comments is not unusual.

Multiple perspectives are important because they increase the conceptual resources of the teacher. What instructors perceive depends on their conceptual resources and influences how they think to intervene in student learning. What instructors notice influences their decisions on whether to slow down the presentation, which topics to cover, what problems to assign, and how to advise particular students. Conceptual resources include the instructor's knowledge of physics. The five resources based in PER are (1) misconceptions, (2) p-prims, (3) reasoning abilities, (4) epistemological beliefs, and (5) inquiry practices. There is a difference between conceptual resources and publicly articulated perspectives, a distinction without influence in paper 31. The excerpted class conversation involves Newton's Laws. Thus, an instructor's understanding of Newton's Laws is essential. Unfortunately, Newton's Laws concern only physical phenomena, and with this resource alone an instructor is limited to perceiving students' statements as correct or incorrect. The nature of the students' knowledge has many shades of gray that this true/false perception ignores.

Misconceptions are strongly held, stable cognitive structures that differ from expert conceptions. Misconceptions fundamentally affect student understanding of science and must be overcome for students to achieve expert understanding. This is in contrast to the idea that students are simply ignorant. For the instructor to simply transfer information is ineffectual. Misconceptions must be overcome before expert opinion will be accepted by the student, who finds his current misconception both reasonable and useful. This overcoming is based on a process of drawing out explicit statements of the misconception from the students, confronting the misconception with arguments and evidence, and then promoting more appropriate conceptions.

p-prims' basic premise is that there are useful substructures within misconceptions. Rather than destroy misconceptions to make way for correct physics, instructors should reduce a misconception to its constituent p-prims and properly organize these to construct valid knowledge. p-prims are themselves neither correct nor incorrect. As an example, "maintaining agency" is a p-prim that is often incorrectly applied to "motion is caused by force" but can be correctly applied to building an understanding of momentum.

There are goals in addition to content knowledge, such as promoting scientific reasoning, habits, and attitudes. Scientific reasoning ability involves, and develops from, abilities for argumentation, including the ability to identify and evaluate different points of view. These abilities are not sufficiently developed in most nonscientists. Many students in this study merely repeated themselves when asked to explain or defend their statements. In the example, one student was correct and another wrong with respect to Newtonian Mechanics; neither student could give valid evidence or argument. Thus an instructor might choose to develop the students' scientific argument abilities rather than proceed as if "he saw them only through traditional content-oriented perspectives."

Epistemological beliefs influence how students reason in a physics class. Students' beliefs about the course and the knowledge and reasoning it will entail impact student actions. Some students believe that understanding physics means being familiar with a collection of facts and formulas, that everyday experiences are not relevant, and that learning physics means memorizing information supplied by the professor or textbook. Other students believe understanding physics means developing a sense of its underlying principles and coherence, that formalism does represent everyday experiences, and that learning physics means applying and modifying one's own understanding. Thus, in an extended student debate, the instructor often faces the dilemma of dealing with appropriate but nascent epistemological beliefs in the context of inappropriate and hardening content misconceptions. Jumping in to "fix" a content misconception can all too often reinforce the poor epistemological idea that all truth comes from the instructor; not jumping in risks a further hardening of a false physics belief.

The Inquiry Practices perspective stands the traditional view on its head, arguing that social participation in the scientific community is a requirement for building individual knowledge and ability. By this view, students and physicists participate in socially constructed, situated practice; scientific knowledge and practice are collective constructs of the scientific community. Fundamentally, learning physics means becoming a member, and adopting the practices, of the community of physicists. These practices can seem quirky and arbitrary from the outside. In the excerpted class conversation, Harry responded to Amelia's concern that outer space has gas in it, which would slow down a moving ball. His response, "We're talking about ideal space," was both meant and received as a joke, with the participants laughing. Assuming ideal conditions is not natural or routine for this class. There was much discussion about whether assuming no friction on an earth-bound surface also meant that gravity had to be turned off. Of course, from a thermodynamical point of view, a physicist would have to agree. In Newtonian force problems, this is dismissed under the rubric "ideal". Under Inquiry Practices, instructors intervene to establish certain social practices rather than to develop individual knowledge and abilities.

Considerations not addressed by paper 31 include mathematical ability, use of formalism, gender differences, and learning anxieties. Hammer's PER agenda for further work is: (a) distinguish the goal of developing a coherent theoretical framework from the goal of improving current physics education; (b) cultivate a stance of inquiry in instruction; (c) develop a variety of conceptual resources in addition to technical resources; (d) develop narrative accounts of authentic episodes of learning and instruction; and (e) study instructors' use of perspectives. The author closes with five additional thoughts. First, we do not have a shared precise definition of "understanding a concept", and so the FCI can only be an alternative lens through which to consider student progress. FCI scores are valuable but not objective measurements, because there is no agreement on exactly what the FCI measures. Second, learning to teach should involve developing the skills to gather information, including skills of moderating discussions and interviewing students. Further, forums need to be created for conversation among currently isolated instructors. Third, teachers need a flexible awareness of students' strengths and needs. Multiple perspectives on knowledge, reasoning, and learning facilitate this flexibility and reduce misdiagnosis. An inappropriate "treatment" can do more harm than good. Fourth, written narratives or annotated videotapes bridge the gap between research and instruction. Fifth and finally, there is a difference between personal conceptual resources and publicly articulated perspectives, and this needs to be explored in authentic conditions.

The parallelism drawn early in the paper between traditional instructors and physics-naive students has a bit of a bite to it for those who have read some of the research on student misconceptions (papers 36-40). Force is the most studied and written-on concept, with 30 specific misconceptions listed in the 1992 FCI paper (paper 1) and continuing commentary through a June 2001 letter to the editors of AJP entitled "The Word 'Force'".

At this juncture, I am confused by the literature's use of 'epistemology'; looking it up in Webster's hasn't helped. I sense two uses. First, a broad use where constructivism is rooted in epistemological principles like the construction principle, the assimilation principle, the accommodation principle, etc., of our first paper. Second, a narrow use where we are probing student epistemological beliefs such as "all truth comes from the teacher", with the first, broader field labeled educational theory. While constructivism can be labeled an educational theory, it is far from the only one, and nowhere in PER is it compared or contrasted with its competitors. So, I'm assuming we're already one step removed from pure educational theory, which just increases my confusion about how these authors use the word 'epistemology'. Prior to this reading, epistemology was not part of my six-year-acquired educational theory vocabulary, whereas constructivism, Socratic Method, and peer instruction, for example, were.

Moving on, Hammer makes a point by noting the student laughter at "ideal space". Their laughter isn't actually odd. What's odd is that ideal conditions have any practical usefulness, given the great complexity of reality. In atmospheric physics, the Kolmogorov model is insanely simple compared to the chaotic, turbulent, real atmosphere, yet the model's 5/6th power prediction matches experimental reality perfectly. It's amazing that a spherical cow has any uses whatsoever, much less so many that are so useful.

The major problem with comments like "The disagreement concerns what, precisely, the FCI measures..." is that <g> is the FCI's secondary use. The FCI's primary use is as a student diagnostic test to tell what specific misconceptions about force exist within a student body, so that these misconceptions might be addressed by instruction; and that is precisely what it measures: which specific misconceptions, out of the 30 given, exist in a student body. That the FCI has not been used for this purpose in the literature does not mean that there are legitimate questions as to exactly what it measures. There are very legitimate questions as to how far to stretch the use of <g>. While force is critical, it is only a small part of physics content. To build entire course syllabi to maximize <g> was never the original intended use of the FCI. The fundamental problem with the FCI is that it is unique. PER needs a dozen FCIs focusing on different concepts of equal importance to force. A last comment: teachers aren't quite as isolated as they once were; paper 2 notes the existence of the PHYS-L and PHYSLinR nets, which are respectively computer networks of physics teachers and physics education researchers.


27.3 : Paper 32, Maryland Physics Expectations Survey

Finally, we get to the last assessment tool. This reference introduces a 34-question survey that provides instructors with insight into aspects of learning other than physics content. It's a mix of course-specific student decisions and student-specific epistemological beliefs. Getting to know your students' beliefs and needs is an I.E. requirement, and this tool is helpful.


E. Redish, J. Saul, and R. Steinberg, "Student expectations in introductory physics," Am. J. Phys. 66(3), 212-224 (1998)


What students expect to happen in their physics course plays a critical role in how they respond to the course. Their expectations play a role in what they pay attention to and what they choose to ignore. Expectations are a factor in a student's selection of the activities by which he will construct his own knowledge. The Maryland Physics Expectations (MPEX) survey is an agree-disagree questionnaire developed to probe student expectations about the process of learning physics and the structure of physics knowledge, rather than physics content. In physics courses, there is a hidden curriculum of goals not listed in syllabi; these goals include:


getting students to make connections, understand limitations and conditions on the applicability of equations, build their physical intuition, bring their personal experience to bear on their problem solving and the real world.


The MPEX focuses on cognitive attitudes that affect what students choose to do. It concentrates on what happens in the student, rather than on what the teacher is doing. Students seek efficiency, that is, the achievement of a satisfactory grade with the least possible effort. This often results in a severe, unnoticed penalty on how much they learn. Paper 32 provides an overview of previous research on cognitive expectations. Simplistically, student expectations matter. From their expectations, it is possible to classify students into several categories. Young adults frequently are in a stage called binary, in which they expect to learn "the truth" from authorities. Students who want to become creative scientists need to move beyond the binary stage to the constructivist stage. In the constructivist stage, students carry out their own evaluation of an approach, equation, or result; they understand both its conditions of validity and its relationship to fundamental physical principles.

The MPEX gathers information on the distribution of student expectations in large populations. The best way to confirm or correct informal observations, and to find out what the students really think, is to conduct repeated, detailed, taped, and transcribed interviews with individual students. David Hammer interviewed six students for ten hours each over the course of a semester for his Ph.D. thesis at Berkeley. These interviews were taped and transcribed, and the students classified. This is a prohibitively expensive procedure for a large number of subjects.

The MPEX has 34 items probing six dimensions, three of which match Hammer's study. The dimensions are (1) Independence, (2) Coherence, (3) Concepts, (4) Reality Link, (5) Math Link, and (6) Effort. Paper 32 recommends 20-30 minutes for taking the MPEX. It provides matched pre- and post-instruction data on 1500 students at 6 institutions; percentages are provided along with several pages of discussion. Validation of the MPEX involved over 100 hours of videotaped student interviews, during which it was found that students are not always consistent in responding to similar questions and situations. The authors believe this is not a failure of the survey but rather a function of the complex nature of human cognition. Because of this complex nature, and the simple fact that self-reported perceptions may not match actual behavior, the results of the MPEX for individuals can be misleading, and the results for groups can understate unfavorable student characteristics. Paper 32 defines "expert response" and "favorable response" as the same thing: the preferred response of Group 5 at an 80% agreement level. Group 5 was nineteen college and university teachers implementing "Workshop Physics" in their classrooms after attending a workshop at Dickinson College. Group 5 was asked to respond with "the answer they would prefer their students to give." Other calibration groups were compared to Group 5. These included engineering students at UMCP [University of Maryland, College Park], members of the US International Physics Olympics Team, high school teachers attending a Dickinson College Summer Seminar, and college teachers attending the same seminar. These groups were plotted on an Agree-Disagree (A-D) plot and compared to each other on additional A-D plots. The paper also compares the individual dimensions to the average for a single group.
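
To make the A-D plot bookkeeping concrete, here is a minimal sketch in Python. The Likert coding (4-5 counted as agree, 1-2 as disagree, 3 as neutral) and the per-item favorable-direction key are my assumptions for illustration only, not conventions taken from paper 32.

# A minimal sketch of Agree-Disagree (A-D) bookkeeping; my illustration,
# not code from paper 32. Assumed coding: Likert 1-2 = disagree, 3 = neutral,
# 4-5 = agree; 'favorable' is a hypothetical per-item key giving Group 5's
# preferred direction, 'A' or 'D'.

def ad_point(responses, favorable):
    """responses: one {item: likert_value} dict per student.
    Returns (percent favorable, percent unfavorable) for the group."""
    fav = unfav = total = 0
    for student in responses:
        for item, value in student.items():
            total += 1
            direction = 'D' if value <= 2 else ('A' if value >= 4 else None)
            if direction is None:
                continue                     # neutral: counted in total only
            if direction == favorable[item]:
                fav += 1
            else:
                unfav += 1
    return 100.0 * fav / total, 100.0 * unfav / total

# Hypothetical usage: one (favorable%, unfavorable%) point per group.
# favorable = {13: 'D', 14: 'A'}
# point = ad_point([{13: 5, 14: 4}, {13: 2, 14: 3}], favorable)

Each group then lands at one point on the A-D plane, which is what allows the calibration groups described above to be compared at a glance.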

The MPEX survey provides a quantitative measure of characteristics which Group 5 hopes and expects its students to have. The observations were: (1) in all six reported schools, the initial state of the students was substantially below the initial state of the experts; and (2) the result of instruction was a lowering of favorable responses! This reference uses the six schools' data to flesh out all six dimensions. It provides a paragraph about each dimension in which the dimension is defined, the survey questions for that dimension are enumerated and discussed, a couple of items are elevated to touchstone status, and various school percentages are compared and contrasted. Paper 32 is worth reading by any teacher who actually gives the MPEX to his students. In their statistical significance section, the authors reduce their five-response Likert scale to a two-response scale. They determine that a 5% shift is significant for schools with 450+ students and that a 10% shift is significant for schools with 100+ students. In addition to the above observations, the authors note that the initial state of students is more favorable in the selective liberal arts institution and less favorable in the two-year college, as compared to large public universities. The last paragraph of the Implications section is a good summary: "The survey presented here is a first step toward understanding these issues and expanding our understanding of what is really going on in our classrooms."
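
As a sanity check on those thresholds (my own back-of-the-envelope estimate, not a calculation from paper 32): for a two-response item with proportion p near 0.5, the standard error of a measured percentage is sqrt(p(1-p)/N), so a two-standard-error criterion gives

2·sqrt(0.5·0.5/450) ≈ 4.7%    and    2·sqrt(0.5·0.5/100) = 10%,

which is consistent with the 5% and 10% figures the authors quote.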

There are three points I'd like to raise. (1) Taking the test reveals that it's focused on math-based physics classes; it implicitly assumes a traditional algebra-based class as a minimal norm. (2) This reference does a nice job highlighting the foundation of PER, the interview. (3) The labeling that changes a group of 19 instructors implementing Workshop Physics after a Dickinson College workshop into "Group 5", then "expert", then "favorable" basically means that this survey matches your students' responses to the hypothetical idealized "student" responses generated by 19 Workshop Physics instructors, immediately after their own instruction. Not an overwhelming call to arms if there is a mismatch. This reference provides an all-too-honest and prevalent use of the label "expert": anyone who agrees with the author. With that said, the MPEX is one of the few public epistemological tests available for use by the classroom instructor, and I did use it in my study of the initial knowledge state of the Cabrillo Community College physics student. The upcoming paper argues that the MPEX mixes epistemology with course-specific expectations.


27.4 : Paper 33, Physics Students Learn by Rote


A. Elby, "Another reason that physics students learn by rote," PER Am. J. Phys. Suppl. 67(7), S52-S57 (1999)


Both high- and low-achieving students "play the game". They distort their behavior to enhance their grades, at the cost of achieving a deep understanding of physics. Students' behavior diverges from their epistemological beliefs, in part because of previous habits and in part because of course-specific beliefs about how to get high grades. Students report spending more time focusing on formulas and practice problems, and less time focusing on concepts and real-life examples, than they would spend if grades didn't matter. Elby addresses methodology and provides some sample survey questions. The survey asks students how they allocate their study time among concepts, formulas, practice problems, and real-life examples. One allocation, historical sketches, was dropped after the pilot survey because nobody spent time on it. The students are then asked how Dianna, a hypothetical student who is taking this course pass-fail with a goal of understanding physics more deeply, should allocate her study time. The study focuses on the discrepancy between students' self-reported study habits and the study habits they generate for Dianna.

Paper 33 is on student perceptions, rather than actual behavior. Students systematically distort their study habits in certain directions. They spend more time focusing on formulas and practice problems, and less time focusing on concepts and real-life examples than they would have Dianna spend. The average total distortion percentage is 25%, with a standard deviation of 12%. There is no significant correlation between grades and total distortion percentage. The more severely a student distorts, the more likely he is to view these distortions as necessary for achieving top grades. It is interesting, however, that 27% of students think Dianna would get better grades than themselves, attributing this to her having far less test anxiety. Students view acquiring a deep understanding of physics as a sufficient, but not necessary condition for doing well on tests. In summary, students spend a disproportionate amount of time focusing on formulas and problem solving algorithms, even when they "know better". They do this because they believe that exams reward this behavior, while in reality, grades and amount of distortion are not correlated.
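
To pin down how such a "total distortion percentage" could be computed, here is one plausible formalization in Python; the formula and the sample allocations are my own illustration of the idea, not paper 33's actual definition or data.

# A hypothetical formalization of "total distortion percentage" -- my sketch,
# not paper 33's definition: the percent of study time a student shifts
# between categories relative to the allocation he prescribes for Dianna.

def total_distortion(self_alloc, dianna_alloc):
    """Each argument maps a study category to a percent of study time
    (the percentages sum to 100). Returns the total percent of study
    time shifted between categories."""
    shifted = sum(abs(self_alloc[c] - dianna_alloc[c]) for c in self_alloc)
    return shifted / 2   # each shifted point leaves one category and enters another

# Hypothetical allocations, not data from paper 33:
# self_alloc   = {'concepts': 20, 'formulas': 35, 'problems': 35, 'real-life': 10}
# dianna_alloc = {'concepts': 35, 'formulas': 20, 'problems': 30, 'real-life': 15}
# total_distortion(self_alloc, dianna_alloc)  ->  20.0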

The author does not attribute distorted study habits entirely to traditional teaching styles as most of the surveys were conducted in reform community college classes. From years of experience and from initial homework assignments, many students take home the lesson that rote understanding works well enough. A later test that requires deep understanding, will often not be viewed as highlighting the inadequacies of rote understanding, but rather will be viewed as unfair, or proof positive that they, the students, are just not good at physics. To counter this, instructors might assign less "rote-able" homework and give conceptual mini-quizzes very early in the course.

Elby's idea that a person would study differently "if grades didn't matter" could be experimentally approached by comparing student study habits between a school where grades are very important, such as Stanford, and a school that does not offer grades, as used to be the case at UCSC. I agree with paper 33's basic premise that learning and getting grades are not the same thing, even though they may parallel each other. The MPEX assumes the skills to learn and the skills to get a good grade are the same. Questions 13 and 14 are both used in the Independence dimension of the MPEX, yet one starts "My grade in this course...," the other "Learning physics is...." Questions 13 and 14 are never contrasted, only used as if they prove the same thing. An interesting book that goes beyond perception and involves actual behavior is They're Not Dumb, They're Different: Stalking the Second Tier, by Sheila Tobias. Its focus is different from that of this reference, but it is well worth reading. My two reservations about the above paper are, first, that it's too much a thought experiment to be the final word on this idea, and second, that having found a real flaw in the MPEX, the author offers no substitute. Thus, an instructor must either create his own instrument or make do with a flawed tool. It is most likely that the MPEX will be used despite its flaws.


27.5 : Paper 34, Productive Learning Resources of Students


D. Hammer, "Student resources for learning introductory physics," PER Am. J. Phys. Suppl. 68(7), S52-S59 (2000)


Hammer notes that the current PER research focus on misconceptions and difficulties has provided practical benefits but is too restricted in scope. He hopes to provide some balance by bringing to light the productive learning resources of students. Hammer structures his paper around resources: a rough idea of resource, conceptual resources, and epistemological resources. He uses the following quote to illustrate his point (after reading the quote, please take a minute or two and work on it before reading on):


Suppose you place a box in a stream of water, and suppose the temperature of the water is 20°C. If the temperature of the box is less than 20°C, then the effect of water flowing over the box will be to raise its temperature; if the temperature of the box is greater than 20°C, then the effect of the water flowing will be to reduce its temperature. Of course, there may be other factors as well: the box may have an internal source of energy; it may be in thermal contact with the air or with the ground, either of which could have a different temperature. Still, if the box is warmer than 20°C, the water cools it, and if the box is cooler than 20°C, the water warms it. Now suppose you place the box in a "stream" of sunlight. What is the corresponding temperature of the box, if there is one, such that if the box is cooler than that temperature the effect of the sunlight is to warm it, and if the box is warmer than that temperature, the effect of the sunlight is to cool it (more rapidly, that is, than the box would cool in the absence of sunlight)?


Hammer speaks in more detail, but basically, if you thought about this for a minute or two, what you probably did was use different ideas (resources) and try to resolve conflicts between what each application in isolation told you. No resource was actually wrong, just perhaps not useful for solving this particular problem. Indeed, much of the actual work is reconciling the conflict, or bringing the resources into coherence. The essential point is that mental phenomena are attributed to the action of many agents acting in parallel, sometimes coherently, sometimes not. This is in contrast to the idea of a misconception, which by its very name implies a conception, a single cognitive unit that is judged by its conformity with expert understanding. Previous research has shown that instructors often have inappropriate presumptions as to available student resources. Still, students do construct new knowledge out of prior knowledge. Students have prior knowledge, and it is a student conceptual resource. Awareness of this knowledge can have a positive impact on instructional practices.

Hammer speaks to the ideas of anchoring conceptions and bridging analogies. He describes in some detail Minstrell's use of the idea of springs to help students understand the passive force a table exerts on a book. The students' preexisting ideas about springs are the anchor from which the instructor builds bridging analogies to the confusing idea of a passive force. The author describes the way Elby pushes the use of students' resources one step further by taking what many would view as a misconception and reweaving this "raw intuition" into correct physics. The description centers on the truck-and-car collision misconception that the collision forces are unequal: by acknowledging that the speeds gained and lost are, intuitively and correctly, unequal, the intuition is rewoven into correct and intuitive physics. Hammer next discusses diSessa's p-prims. The author uses the classic example that students say summer is warmer because the earth is closer to the sun. The usual interpretation is that students have the faulty concept that the earth has a highly eccentric elliptical orbit. It may instead be that, in a quick search, the first general resource identified is the p-prim: closer means stronger. Although misapplied, the p-prim itself is not wrong; music is louder closer to a speaker, light more intense closer to a light bulb. In p-prim use, anchoring concepts activate productive resources and bridging carries those activated resources back to the original problem.
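
As a concrete check on the closer-means-stronger p-prim (standard physics, my illustration rather than an example from paper 34): a point source radiating power P uniformly in all directions produces an intensity

I = P / (4πr²)

at distance r, so halving the distance quadruples the loudness of the speaker or the brightness of the bulb. The resource is correct here; it fails only when misapplied, as to the small seasonal change in the earth-sun distance.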

Instruction can be designed to help students use their existing resources more productively. Students often misapply the attributes of the coordination class, objects, to waves. For example, students often expect that the impact of a sound wave will propel a dust particle across the room. Hammer argues that the coordination class, events, would be a more beneficial starting point from which to explore waves. The author shows Hewitt appealing to the student p-prim, balancing, when Hewitt writes his equations in exaggerated or diminished symbols:

Ft = Ft    (in Hewitt's original typography, one side's F is printed large with a small t, and the other side's F small with a large t, so that the two sides visually balance)

In another context, the student may not access the p-prim, balancing, and thus may not access the understanding based upon it. It is essential to articulate, examine, and refine the instructor's understanding of student resources. The details of this understanding may have significant consequences in how instructors attend and respond to student thinking.

Epistemological resources are defined as the student's understanding of the nature of knowledge and how it is obtained. These are less studied than conceptual resources but are of equal importance. As with conceptual studies, most epistemological work has focused on misbeliefs and how to correct them. Recent research goes below beliefs to find epistemological resources (ep-resources), which themselves are neither right nor wrong, but rather are applied or misapplied. The author goes back to his sunlight-on-a-box question and speaks to the ep-resources used. As a physicist, you know not to always trust your first idea. You know to compare different ways of thinking with each other. You know to monitor your understanding for coherence, and to address inconsistencies when you find them. For example, you may have quickly decided that the sunlight can only add energy to the box, and then spent most of your time trying to identify specifically why reasoning in terms of equilibrium does not work. In other contexts, such as deciding what to have for dinner, once you decide on grilled salmon, you stop thinking about the question. It would be odd to spend time trying to identify specifically what would be wrong with choosing lasagna. For some students, the two situations, box and dinner, may activate the same ep-resources; they may think it just as odd to continue thinking about a physics problem once they have chosen an answer as to continue thinking about a dinner selection once they have chosen an entree. Thus, part of learning physics is learning when to activate which epistemological resource.

A core difference between traditional and reformed physics instruction may be the ep-resources activated. In traditional instruction, the authority figure gives facts to students to accept and remember. In reform instruction, students construct knowledge from experimental observations, with formal approximations of reality being tried and adjusted to improve performance. Epistemological anchors are critical to metaphors and analogies. A classic one is:


instructors compare mental exertion to physical exertion, to help students think of knowledge and ability as developed through effort. In that case, the context of physical exercise serves as the epistemological anchor, a context in which students naturally associate effort and persistence with improvement.


Instructors who expect productive resources will be inclined to look for them in their students' reasoning and to help students look for them themselves. These strategies presume that student resources are mostly in place and must merely be reorganized to align with scientific knowledge and practices. At an early age, however, the focus should be on the formation of intellectual resources, such as closer means stronger (conceptual) or I see it (epistemological). This formation may come prior to alignment, with early science education mostly being messing about. Science cannot end there, but Messing About is a better beginning than Remember the Magic Word.

A few points. First, it's interesting how epistemological resources are called context dependent several times. One of these mentions was a warning that, absent the context, the information built by these resources may itself be inaccessible. Second, this is obviously a theory paper, loaded as it is with qualifiers. This is most pronounced when "core differences" only "may be". Hammer is in need of an experimentalist to turn his "they may think" into "they think" or "they do not think".

Third, I strongly agree with his advocacy for early messing about. My theory is that traditional education focuses on formalizing an existing set of knowledge, formed from out-of-school activities. My anecdotal story is about my father. My father's obsession with and amateur work on cars predated his eighth-grade auto shop by several years. That class formalized and structured existing knowledge, adding only details. To this day, my father is proud of the formal nature of that class; he wrote a paper on the V-8 engine. In his day, kids messed about at home, at friends', or at work. They got structure and formal detail at school and valued it. To the extent that students have never messed about before your class, you are now expected to convert class time into mess-about time. This must happen for two reasons. The first is that without previous messing about there are no pieces for traditional instruction to formalize or put together; traditional instruction constructs comparatively little. The second is that if you don't have the puzzle pieces to put together traditionally, you have to make puzzle pieces via I.E. methods. Modern cars and computers are too expensive and too black-boxish to be of much use to potential Thomas Alva Edisons. Having essayed my attempt at theory, we will now address the last paper in the epistemology section. Two papers by Redish, two papers by Hammer, and now the second by Elby.


27.6 : Paper 35, Methods to Advance Epistemological Development


A. Elby, "Helping physics students learn how to learn," PER Am. J. Phys. Suppl. 69(7), S54-S64 (2001)


Epistemological sophistication is valuable. Previous studies show that students' epistemological expertise correlates with academic performance and conceptual understanding in math and science. A student who sees physics knowledge as a coherent web of ideas has reason to monitor his understanding for consistency and coherence. This monitoring is not activated during rote memorization. Unfortunately, many of the best research-based reform physics curricula, ones that help students obtain a measurably deeper conceptual understanding, generally fail to spur significant epistemological development. Students often revert to their old learning strategies in subsequent courses. This reference presents instructional practices and curricular elements explicitly intended to advance epistemological development.

The students in Elby's study are high school students in California and Virginia. The assessment tools are the MPEX and the Epistemological Beliefs Assessment for Physical Science (EBAPS). The MPEX combines epistemological beliefs and course-specific expectations, and was designed for college students. The EBAPS was constructed to probe epistemology alone, and was designed for high school-level chemistry, physics, and physical science classes. The assessment was administered as for-credit homework in 1997-1999. The MPEX was given only to the Virginia group. The mean gains (post minus pre) were Overall 11%, Independence 11%, Coherence 25%, Concepts 27%, Reality Link 21%, Math Link 14%, and Effort -18%. The subscales are detailed in paper 32. (For comparison, the best results from paper 32 were those of Dickinson College, with Overall 1% and the subscales, in order, 5%, 8%, 11%, -4%, 1%, and -18%.) The substantial deterioration on the Effort subscale was common to all paper 32 and paper 35 groups. The EBAPS's five subscales are Structure of Knowledge, Nature of Learning, Real-life Applicability, Evolving Knowledge, and Source of Ability. The first three subscales correspond to Concepts and Coherence, Independence, and Reality Link on the MPEX. The fourth subscale, Evolving Knowledge, deals with scaling scientific knowledge from "tentative" to "settled". The fifth, Source of Ability, questions whether scientific ability is a fixed natural talent or the result of hard work. There were statistically significant positive gains in the first three subscales and in the overall result. The Evolving Knowledge subscale is not addressed by Elby's curricula, in part because the knowledge presented at the introductory level is comparatively settled. Elby tried and failed to effect positive change in the Source of Ability subscale of the EBAPS. The favorable beliefs (the coherence and conceptual nature of physics, the constructive nature of learning, the link between physics and the real world, and the meaningfulness of mathematical equations) come at the expense of content coverage, but not at the expense of basic conceptual development. A focus on epistemology needs to permeate the class in order to have a significant effect.

This reference presents two labs, Newton's 2nd Law and Newton's 3rd Law, that highlight Einstein's epistemological view that science is the refinement of everyday thinking. The author interweaves explicit multiple-choice and leading multipart epistemological (ep-) questions into labs that otherwise resemble University of Washington tutorials or Real Time Physics labs. The students are led by these ep-questions to view their intuitions as useful starting points that, with some explicit guidance, lead to scientific understanding. Physics involves refining, rather than selectively ignoring, your everyday thinking. A class discussion the following day underscores the ep-point. The following is an example of an ep-question embedded in a Newton's 2nd Law lab:


3. Most people have - or can at least understand - the intuition that the forward force must "beat" the backward force, or else the car wouldn't move. But as we just saw, when the car cruises at a steady velocity, Newton's second law says that the forward force merely equals the backward force; Fnet = 0. Which of the following choices best expresses your sense about what's going on here?

a) Fnet = ma doesn't always apply, especially when there's no acceleration.

b) Fnet = ma applies here. Although common sense usually agrees with physics formulas, Fnet = ma is kind of an exception.

c) Fnet = ma applies here, and disagrees with common sense. But we couldn't expect formulas to agree with common sense.

d) Fnet = ma applies here and appears to disagree with common sense. But there's probably a way to reconcile that equation with intuitive thinking, though we haven't yet seen how.

e) Fnet = ma applies here. It agrees with common sense in some respects but not in other respects.

Explain your view in a few sentences.


Ep-questions were also assigned as homework and as in-class problems. They were graded on completeness, not content. The responses helped the teacher plan subsequent classes, and "nudge" individual students. Some example ep-questions and their responses are provided.

Elby devotes a page to his interesting homework philosophy. The philosophy hinges on grading effort, not correctness, and on handing out a partial solution set with the assignment. Elby argues that traditional students view homework as grade-getting rather than learning. His homework methodology, combined with frequent in-class mini-quizzes, was explicitly designed to push students toward the realization that thinking through problems, not copying each other or a book, is the best way to learn physics. Elby presents some sample test questions. To reward qualitative conceptual reasoning, he asks a high percentage of conceptual questions on both homework and tests. While he does include many standard quantitative problems, Elby avoids plug-and-chug problems. His approach reinforces the idea that physics knowledge is more conceptual than factual and that rote application of equations is useless. Because extensive reliance on the traditional huge-range-of-topics textbook would have undermined the author's ep-agenda, he used the textbook primarily to introduce factual information.

The cost of deep understanding and discussions of epistemological issues is coverage. For example, in California, most students did acquire a basic conceptual understanding of force and motion in one dimension, energy, waves, optics, and aspects of electrostatics. Momentum, oscillating motion, electric potential, electric circuits, magnetism, and all of modern physics were skipped. In Virginia, Elby was part of a department in which all the teachers used the same tests. Thus, he covered more material, at least qualitatively, and took an active role in the common test development. Separating instructor versus curriculum effects is not possible in Elby's paper but should become clearer as more instructors implement ep-curricula.

While the data is from a single high school class, it compares quite favorably to the data in paper 32. The glaring exception is the -18% in Effort. The -18% matches data presented in paper 32, but it is particularly distressing here because Elby made serious, directed efforts to improve this category. Breaking the Effort category out by student grade subsets would be interesting. Paper 29 mentions that students engage in psychological damage control and do less work if they aren't doing well in the class. Paper 35 is a how-to-teach paper placed in this chapter because of its explicit epistemological motivation. It's admirable and rare that the author lists both the curricular topics covered and those not covered. Sadly, buoyancy makes neither list!

Epistemology argues that how students learn is critically dependent on what they already believe. What they believe is divisible into that which is in agreement with scientific community beliefs and that which is in disagreement. That which is in agreement is useful in bridging techniques and as a foundation for future knowledge. That which is in disagreement must be dismantled. PER focuses on the initial knowledge state of the student. We need to know what students believe before we can use or dismantle the belief. It's particularly important to get rid of misconceptions because the human mind can and does hold mutually exclusive beliefs. Thus, it is possible for new scientific and old nonscientific beliefs to coexist. Cohabitation of beliefs in our mind is not the goal of education if one of them is demonstrably wrong. The upcoming group of papers (36-40) focuses on Student Misconceptions. They're quite interesting. Much of early PER was focused on Force. From there, PER branched into Mechanics via the intermediate step of Newton's Laws. The more current work on student misconceptions has moved into electrostatics, optics, special relativity, and thermodynamics.



CHAPTER 28 : Student Misconceptions


28.1 : Paper 36, The Reasoning Behind the Words in Electrostatics

We start off this chapter on student misconceptions fully aware that Hammer, for one, would argue the term is a misnomer. Still, it's the most common label, followed by naive beliefs, nonscientific preconceptions, incorrect ideas, common sense, misinterpretations, tendencies, etc. Also, please note that other papers have examples of student misconceptions, particularly papers 5, 6, and 7.


R. Harrington, "Discovering the reasoning behind the words: An example from electrostatics," PER Am. J. Phys. Suppl. 67(7), S58-S59 (1999)


Many students have incorrect physics. They also can interpret what an instructor or a textbook says in a way that is reasonable but incorrect. This is particularly true for previously unknown material. Harrington's note speaks to two such misinterpretations. The first concerns what "neutral" means in electrostatics. Depending on the surveyed population, between 5 and 28 percent of respondents believe that neutral and negatively charged are synonyms. As one student is quoted: "It is negative because it's not charged." An elementary school teacher elaborated: "Doesn't positive mean yes and negative mean no?" Thus negative charge slips to no charge slips to neutral, and they are seen (incorrectly) as synonyms. This was a difficult fact to uncover. Students can use correct terminology and state correct physics principles and still fail to give a correct explanation for a physical situation, as the author highlights with an example. The second misinterpretation in electrostatics involves the statement "charge moves readily on a conductor but not on an insulator." Students often replace "on" with "onto" in their thinking. This is consistent with the naive belief that insulators protect you and therefore cannot be charged. Prior to instruction, 75% - and after instruction, 40% - of the sampled population believe that a charged object must be a conductor. The author provides two research questions that can bring out these issues in a student population.

I believe there is a physics misconception in the above note: how charge interacts with insulators. There are two additional issues, however. One is word definition; the other is English language ability. Paper 1, the FCI paper, has the following comment: "Student common sense beliefs are often metaphorical, vague, and situation dependent; language use facilitates vagueness by turning such words as force, energy, and power into synonyms." Two "interesting problems" were dropped from the prototype FCI prior to publication because the problems were "misread more often than not." In fact, 5 out of 8 American-born graduate students, who were given the FCI and interviewed afterwards by the FCI authors, exhibited "moderate to severe difficulty understanding English text." Not only would a math diagnostic test give interesting results; so too would a reading diagnostic. I believe that many instructors rely on students having math and language skills much better than students often possess. I base this belief on six years of teaching in secondary schools, where I gave many math diagnostic tests, and three years of TAing, where I hand-graded many tests and lab reports. I've never given a language diagnostic; I would like to, and will, although due to time constraints I must delay doing so until I am employed as a community college instructor. In regard to paper 36 itself, it's interesting to see how Harrington asserts that previously unknown information can be corrupted by a teaching process which bridges back to uncorrected misconceptions.


28.2 : Paper 37, Student Difficulties in Applying a Wave Model

McDermott has written over 60 papers and is the leader of the University of Washington's Physics Education Group. She is the author of "Tutorials in Introductory Physics" and "Physics by Inquiry". While her end product is how-to-teach material, her papers tend to focus on the results of her research into identifying a specific misconception. She will then, via an iterative method, build a tutorial to correct the misconception in accordance with her procedure: elicit a known misconception as a prediction from the student, confront the student via a carefully preplanned experiment/demonstration that violates the student's prediction, and then resolve the conflict by providing an experiment/demonstration-supported scientific replacement for the misconception. For this structure to succeed, she needs a known misconception, preferably one prevalent in a large percentage of the population and one recurring in each freshman class. We will find out quite a bit more about McDermott in the Awards section, as I will be reviewing her Oersted Medal Lecture 2001 paper (paper 47).


K. Wosilait, P. Heron, P. Shaffer, and L. McDermott, "Addressing student difficulties in applying a wave model to the interference and diffraction of light," PER Am. J. Phys. Suppl. 67(7), S5-S15 (1999)


Neither introductory nor advanced students have a functional understanding of either the ray or the wave model of light. Other papers have focused on the ray model, on the photoelectric effect, and on serious difficulties including path length difference and phase difference. Paper 37 describes a guided inquiry method for students to develop a basic wave model for light. This paper - as do all Physics Education Group papers out of UW - briefly describes the context in which a given tutorial is researched and developed. The number of students, the time demands of the course, the emphasis of the tutorials, the informal tutorial structure, the use of teaching assistants, and the "success" definition are all standard to McDermott's work.

The tutorial system represents an attempt to secure the mental engagement of students in large classes. Tutorials replace the typical discussion/recitation part of the traditional lecture, lab, discussion triad. The tutorial pretest is provided. After lecture and laboratory instruction, only 10% of about 1200 students used the correct method of solution on the pretest. Post-tutorial, 70% used the correct method. Correct answers were 5% and 45%, respectively. The pretest involves two small objects oscillating in unison in a ripple tank and asks students to determine whether constructive or destructive interference occurs at 3 specified points. Ripple tanks are used because they are a concrete environment with ease of visibility. The tutorial explicitly addresses superposition and the critical role of path length difference. The students use paper and transparencies of identical concentric circles to connect source separation with the nodal lines. The homework part of the tutorial helps students derive d·sin θ = mλ and justify the approximations used. Students need explicit help in connecting the idea of sufficiently narrow slits to point sources.
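
For reference, the reasoning the homework guides students through is the standard far-field argument (my summary, not a quotation from the tutorial): with slit separation d and the screen far away compared to d, rays from the two slits to a point at angle θ differ in path length by

Δ = d·sin θ,

and constructive interference occurs where the path difference is a whole number of wavelengths,

d·sin θ = mλ,    m = 0, ±1, ±2, ....

The approximations to be justified are precisely the far-field ones: the two rays are treated as parallel, and the slits as point sources.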

The next tutorial is on light. Students are given a double-slit pattern and asked what the picture would look like if one slit were covered. Roughly 30% of the students answered this pretest correctly. This tutorial uses sponges to create slits of various widths in the water tank. A dowel is rolled back and forth to create a wave, and what makes it past the sponges is equated back to point sources for the narrow slits. After this, most students get the pretest concept. Superposition yields very different results for ray versus wave optics. About 45% of students use a hybrid model. The tutorial confronts this with an exercise designed to help students recognize that geometric optics cannot be used for light passing through very narrow slits. The tutorial also addresses how changes in the optical system result in different interference patterns. The students are shown two interference patterns and asked which single change in the system would cause the given shift in patterns. Only about 25% of the students who had had lecture instruction on double-slit interference got it right. Worse, the students had completed the tutorial on two-source interference and did not recognize its relevance to the situation at hand. Additional exercises were designed to help make this connection. Further, there is the big step of going from two slits to N slits. In a new test, a two-slit mask is replaced by a three-slit mask of equal slit spacing. Students are shown what a couple of points look like with the two-slit mask in place and asked how the points change with a mask substitution to three slits. Only 5% of students used correct reasoning. Most students failed to consider path length differences and superposition.

Another tutorial in this series was developed. Transparencies of sinusoidal curves are used in 2-, 3-, and 4-slit problems in which trial and error is used to note patterns in principal maxima and the number of minima between them. The tutorial does not use phasor diagrams. Students do not readily associate a phasor with the value of an oscillating quantity, and they confuse the angle between two phasors with the spatial angle. In post-tests for both two-source interference in water and multiple-slit interference in light, there was marked improvement. In water, 70% used the correct method, "success" in this case being at the 55% mark. In light, 40% used the correct method, "success" in this case being at the 25% mark. "Success" levels for McDermott tutorials vary. Success is defined as having the undergraduate students' post-tutorial class-average test percentage be greater than the graduate students' pre-tutorial in-service average test percentage. The graduate students actually take the post-test as part of their in-service preparation prior to teaching the tutorial to the undergraduates.
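
The patterns the transparency exercise is after are the standard multiple-slit results (again my summary, not the tutorial's text): the principal maxima stay at d·sin θ = mλ regardless of the number of slits N, while the minima fall at d·sin θ = (k/N)λ for integer k not a multiple of N. Between adjacent principal maxima there are therefore N-1 minima and N-2 secondary maxima, which is exactly the kind of counting pattern that trial and error with the transparencies is meant to reveal.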

The next tutorial extends the multiple-slit model to single-slit diffraction. The students are guided through the process (via a pairing procedure) of taking identical, very narrow, evenly spaced slits to the limit in which their separation vanishes. This tutorial addresses three misconceptions: that diffraction is solely an edge effect, that the central maximum narrows as the slit narrows, and that the slit width must be less than the wavelength for diffraction to occur. For the diffraction tutorial, 5% passed the pretest with correct reasoning, and 40% passed the post-test. Success was at the 15% mark.
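
For the record (standard single-slit results, my addition rather than the tutorial's text): a slit of width a has diffraction minima at a·sin θ = mλ, m = ±1, ±2, ..., so as a shrinks the first minimum moves to larger θ and the central maximum widens rather than narrows; and diffraction is appreciable whenever a is comparable to λ, not only when a is less than λ. This is the physics that rebuts the second and third misconceptions listed above.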

The final tutorial in this series combines interference and diffraction using two slits of finite width. Unless explicitly asked to go through the necessary reasoning, many students fail to connect the general case of finite slit width to the limiting case of very narrow slits. For this tutorial, correct-answer percentages are given instead of the usual correct-reasoning percentages. Further, no graduate student score is provided; the pretest was 40% and the post-test 75%. Most students can apply the model developed here to diffraction gratings without more help. It is crucial that students be given the opportunity to go step-by-step through the reasoning involved in the development and application of important concepts. If not explicitly addressed, serious difficulties with the wave model may persist and may impede understanding of more advanced topics, such as the wave nature of matter and the photon model for light. A sound qualitative understanding often improves the ability of students to solve quantitative problems. Advanced study in physics does not necessarily overcome serious difficulties with basic material. Ongoing preparation of graduate teaching assistants in both subject matter and the instructional method has been an important additional benefit.

Paper 37 was more a demonstration of how tutorials are developed until there is an absence of need than a discussion of misconceptions. The tutorials in this series were explicitly motivated because students failed to transfer previously learned material into new arenas. While there is mention of a few specific misconceptions, this reference mostly demonstrates the need for students to see an idea repeated in many different contexts. My review assumes the reader has previously read a McDermott paper, just not this specific one; an assumption which is probably false. I will take the opportunity to claim this as a "teaching moment" and ask you, the reader, to read one of her papers. If you like Mechanics, she does a nice paper on the Atwood Machine [Am. J. Phys. 62(1), 46-55 (1994)]. You will note that she does not use the FCI or any other literature test. Her initial process is individual student interviews in which the student does some small physics experiment and verbally interacts with the interviewer. I am curious about how much her graduate student cohorts have varied over the years as she uses them to define success. I also wonder whether there is a lowering of undergraduate pretest scores, which are pre-tutorial but post-lecture, because during lecture the students know they are going to see it again and thus are more willing to let the first opportunity slip by. My major complaint with this specific paper is the alternating between "correct reasoning" and "correct answer" percentages. A table is given which provides both, but the written discussion picks the more favorable set of percentages for no other apparent rhyme or reason than that they are more favorable. Also, the authors do not address why up to half of the students with correct reasoning are then unable to go on and get the correct answer.

Paper 37 had a follow-up in 2000: "Student understanding of the wave nature of matter: Diffraction and interference of particles," published in PER Am. J. Phys. Suppl. 68(7). The 2000 paper moves from light to electrons; it raises two basic points. Students continue to show an inability to interpret diffraction and interference in terms of a basic wave model, and students lack a fundamental understanding of the de Broglie wavelength. In particular, students failed to see that the de Broglie wavelength is a function of momentum. Further, the relations v = λf and λE = hc were often misapplied. Unlike previous results, students who had worked through the light wave tutorial did significantly better on both the pre- and post-tests than did other students, including those in more advanced classes such as Quantum Mechanics. McDermott continues to use her sets of code words: (elicit, confront, resolve) and (apply, reflect, generalize). Only here, the confrontation relies on photographs of electron behavior rather than actual experiments. In a reflection of the literature, the 2000 paper internally divides a class and compares and contrasts the results of each subdivision.


28.3 : Paper 38, Student Difficulties with Rays

Paper 38 also focuses on optics. I believe that there is an error in it, which in turn has made the paper either suspect or confusing. Of course, I may just have a fundamental problem with optics. You're welcome to venture an opinion on the subject. I include paper 38 because of its basic idea that there are two very different types of rays used in optics. This matters because students tend not to distinguish between the two, to their confusion.


P. Colin and L. Viennot, "Using two models in optics: Students' difficulties and suggestions for teaching," PER Am. J. Phys. Suppl. 69(7), S36-S44 (2001)


Students have great difficulty connecting the ray and wave models of light. Paper 38 analyzes situations where diffraction and interference are observed in the presence of lenses. Special attention is paid to student-generated "rays". There are cases in which both ray and wave models must be used together. The students in this study are third-year French university students. The error which I believe exists is on p. S37, upper right paragraph: contrary to the author's assertion, the path lengths from ? to M in Fig. 3 are not equal. How this affects the rest of the study is an uncertainty for me. The paper poses three problems involving an object, a lens, and an observation screen. The students were given fifteen minutes, and their results were anonymous. The authors used the student-produced diagrams and explanations to highlight their paper. This reference confused me more than I enjoy admitting.

There are two very different "rays" in play: those that actually carry energy and those used to find phase differences. Students do not differentiate between the two, partly because textbooks do not make the difference explicit. The paper suggests experiments to highlight the differences between the two types of "ray". In one such experiment, a mask covers part of a lens, and the screen is at the back focal plane. Only the brightness of the image is affected, even though students often erroneously predict that the image will be truncated by the mask. Unmask the lens. If the screen is at neither the back focal plane nor the plane conjugate to the two holes (sources of light), then fringes are seen instead of an image. Turn off the light. Mask half the lens. Ask what will show on the screen. Turn on the light. The fringes do not merely dim: some stay bright, others disappear entirely. Thus, light paths are distinguishable between those of energy and those of phase. The authors advocate a program of backward selection instead of story-like reading to select the meaning of light paths, which in turn ties into the relationship of sources and screen. I suspect that to fully appreciate paper 38, you would have to both perform the experiments and talk over the paper with an optics expert.


28.4 : Paper 39, The Relativity of Simultaneity and the Role of Reference Frames

Paper 39 advocates teaching special relativity in introductory classes and as such matches my commentary in the I.E. curricula chapter, but upon rereading, it fell more comfortably into this chapter on student misconceptions. One of the most surprising misconceptions is the confusion surrounding what an intelligent observer is.


R. Scherr, P. Shaffer, and S. Vokos, "Student understanding of time in special relativity: Simultaneity and reference frames," PER Am. J. Phys. Suppl. 69(7), S24-S35 (2001)


There is debate over whether to include modern physics in introductory physics classes. PER information can help inform this debate. PER seeks to identify what students can and cannot do, with the hope that this knowledge can be used to develop instructional materials that deepen student understanding. Paper 39 investigates student understanding of time in special relativity with emphasis on the relativity of simultaneity and the role of reference frames. The authors found many students are unable to: (1) determine the time at which an event occurs, (2) recognize the equivalence of observers at rest relative to one another, or (3) apply the definition of simultaneity. This reference illustrates the process used to achieve knowledge of these misconceptions. There is very little previous research on student understanding of relativity. What research has been done shows that students believe: (a) reference frames have limited physical extent; (b) an object's motion is intrinsic, not a quantity measured relative to a reference frame; (c) certain relativistic effects are distortions of perception; and (d) the order of events depends on observer location.

Paper 39's research focuses on three aspects of student ability: (1) identifying relevant events, (2) determining the time an event occurred by correcting for signal travel time, and (3) recognizing that the time interval between two events is not invariant but depends on the reference frame. The paper devotes several paragraphs to discussing and defining reference frame, position of an event, time of an event, and simultaneous. This is a five-year study involving fourteen instructors and eight hundred students from various universities and encompassing various student subgroups. Both student written responses and individual student interviews were analyzed. The research questions are provided and named: Spacecraft, Explosions, and Seismologist. All questions involved two observers with given relative motion. The time ordering of events is given for one observer and desired for the other. Examples of correct qualitative and quantitative answers are provided.

The first version of the Spacecraft question was undirected and given to seven graduate students and to twenty advanced undergraduates. Ninety percent of the students drew space-time diagrams with correct features, but mistakenly had the events as simultaneous in both frames. Only one graduate student spontaneously recognized the relativity of simultaneity. The second version was more directed, with explicit questions. On the second version, most students stated that the events were not simultaneous, but most reasoned incorrectly, focusing on relative position, not relative velocity. Paper 39 then presents a three-part detailed investigation. Part A: belief that events are simultaneous if an observer receives signals from the events at the same instant. Part B: belief that simultaneity is absolute. Part C: belief that every observer constitutes a distinct reference frame.

Part A is probed with a location-specific (3rd) version of the Spacecraft question. Fewer than twenty-five percent of the students in any subgroup (including graduate students) gave correct answers with correct reasoning. The paper illustrates the two major modes of incorrect reasoning with student quotes from interviews. These are: (1) tendency to associate the time of an event with the time at which an observer receives a signal from the event; and (2) tendency to regard the observer as dependent only on his or her personal sensory experiences.

Part B is explored by the 4th version of the Spacecraft question, which, in explicit nontechnical language, makes it clear that reception events are not the ones to consider. The question makes explicit what "intelligent observers who correct for signal travel time" actually means. After development via interviews, the question was given as a written qualifying exam question for doctoral candidacy; seven out of twenty-three examinees answered correctly. Three tendencies were found: (1) tendency to regard the relativity of simultaneity as an artifact of signal travel time, (2) tendency to regard the Lorentz Transform for time as correcting for signal travel time, and (3) tendency to treat simultaneity as independent of relative motion.
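
To see how little the relativity of simultaneity has to do with signal travel time, consider a small Python sketch of my own; the 0.6c relative speed and the one light-second separation are hypothetical values, not paper 39's. Both frames are assumed to have already corrected for signal travel time; the disagreement comes purely from the Lorentz Transform for time, t′ = γ(t − vx/c²):

import math

# Sketch (hypothetical values, not from paper 39): two events simultaneous in
# frame S are not simultaneous in frame S', with no signal travel time involved.
c = 3.0e8                        # m/s
v = 0.6 * c                      # speed of S' relative to S
gamma = 1 / math.sqrt(1 - (v / c) ** 2)

t1, x1 = 0.0, 0.0                # event 1 in frame S
t2, x2 = 0.0, 3.0e8              # event 2: same time, one light-second away

t1p = gamma * (t1 - v * x1 / c**2)   # Lorentz transform for time
t2p = gamma * (t2 - v * x2 / c**2)
print(f"frame S : dt  = {t2 - t1:.2f} s")    # 0.00, simultaneous
print(f"frame S': dt' = {t2p - t1p:.2f} s")  # -0.75, not simultaneous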

Part C shifts the focus from simultaneity to reference frame. Part C uses two new questions: the Explosions and the Seismologist. These questions reveal two tendencies with multiple illustrative student quotes. The first tendency is to treat observers at the same location as being in the same reference frame, independent of relative motion. The second tendency is to treat observers at rest relative to one another but at different locations as being in separate reference frames.

In conclusion, paper 39 has identified widespread student difficulties with the definition of the time of an event and with the role of intelligent observers. After instruction, more than two-thirds of undergraduates and one-third of graduates were unable to apply the construct of a reference frame in determining whether two events are simultaneous. Many students interpret the phrase "relativity of simultaneity" as implying that the simultaneity of events is determined by an observer on the basis of the reception of light signals. They often attribute the relativity of simultaneity to the difference in signal travel time for different observers. In this way, students reconcile statements of the relativity of simultaneity with belief in absolute simultaneity and fail to confront the startling ideas of special relativity. Furthermore, they have significant difficulties with foundational reference frame concepts. In particular, many students do not think of a reference frame as a system of observers that determine the same time for any given event. Such difficulties impede both student understanding of the relativity of simultaneity and student ability to apply Lorentz Transforms correctly. For most people, the implications of special relativity are in strong conflict with their intuition. For students to recognize the conflict and appreciate its resolution, they need a functional understanding of the very basic concepts highlighted in this reference.

The "PER seeks" definition at the beginning of the above paper is particularly well stated. The use of correct answer with correct reasoning is also good. The "correct reasoning" is almost always qualitative. The problem arises because the "correct answer" can either be harder to get (the answer to quantitative work) or easier to get (50% chance on a true/false question) than the correct reasoning is to get. So if reasoning and answer are split, the possibility of vagueness on the author's part and/or confusion on the reader's part is enhanced. A final note on paper 39 is its use of graduate students. Seven out of twenty-three is a bit rough; the authors of the FCI (paper 1) are even rougher. Sixteen first-year graduate students were given the FCI and then interviewed for half an hour each. Two had perfect scores. Three exhibited severe misunderstandings to the point of not understanding Newton's Third Law. Fourteen did not understand the buoyancy force concept. While five were able to state Archimedes Principle; they unfortunately did not understand the role of the pressure gradient. Looking back at McDermott's paper where graduate students' pretests are the benchmark for tutorial success, it's unsettling to note the fifteen percent. On the whole, graduate students are the Rodney Dangerfields of PER. The upcoming paper is rare; it focuses on Thermodynamics. The last paper to do so was paper 7 on the Thermal Concept Evaluation.


28.5 : Paper 40, Student Understanding of the First Law of Thermodynamics


M. Loverude, C. Kautz, and P. Heron, "Student understanding of the first law of thermodynamics: Relating work to the adiabatic compression of ideal gas," Am. J. Phys. 70(2), 137-148 (2002)


Research has identified specific student difficulties in thermodynamics. In particular, students confuse the concepts of heat, temperature, and internal energy. Thermodynamics needs a curricular development approach similar to the research-driven changes in kinematics and dynamics. Paper 40 examines whether students given a familiar situation could (1) recognize that the First Law is relevant, (2) decide whether there is heat transfer or work done, and if so determine the signs, and (3) relate these quantities to a change in the internal energy of the system. Previous research has indicated that students treat heat as a substance residing in a body, that they fail to distinguish between state and process quantities, and that they apply the ideal gas law inappropriately. Paper 40's instructional context subsection is distinguished by a thorough discussion of the critical prepositions: to, on, by. The authors also note a tie-in to mechanics through the definition of the work done on a system, Won = ∫F·ds. The study's methods are described, with the four thermal problems and the one mechanical-work problem presented. These University of Washington authors used two primary research methods: individual demonstration interviews and written questions administered to large groups. Five hundred students at several universities and at different educational levels participated in this study. Post-instruction correct explanations ranged from 10% in the introductory classes to 50% in the advanced classes. Particularly depressing is that the limited pre-post comparison available leads the authors to state: "the amount of instruction did not seem to affect the results."

The students did not use Work, nor the First Law of Thermodynamics, to solve the problems; instead they relied on a misuse of the Ideal Gas Law. Difficulties with general principles were often intermingled with difficulties related to the specific system discussed. Misinterpretation of the Ideal Gas Law was common; a couple of problems were found. First is a general faulty understanding of multivariable mathematical relationships. Second is a more specific incorrect set of microscopic ideas. The paper quotes three students, one of whom first implicitly holds T constant and then implicitly holds V constant, in the statement: "Decreasing the volume increases the pressure, which increases the temperature, PV = nRT." Another student stated an inverse relationship between V and T that, while mathematically surprising, is (mis)explainable via a microscopic model that ties temperature to number density. As quoted from one student, "The molecules are getting compressed, and they have less space to move around, so they are bumping into each other a lot more, and the temperature increases;" another student adds, "more collisions per unit time equals more heat generated." Students focused on interactions internal to the system, rather than interactions between the system and its environment.
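
A minimal numeric sketch, using my own illustrative numbers rather than paper 40's problems, shows why the Ideal Gas Law alone cannot answer an adiabatic compression question while the First Law can:

# Sketch (illustrative numbers, mine): adiabatic compression of a monatomic
# ideal gas. PV = nRT alone cannot say how T changes, since P and V both
# change; the First Law (Q = 0, so dU = Won) plus the adiabat condition does.
R, n = 8.314, 1.0
Cv = 1.5 * R                          # monatomic ideal gas
gamma = 5.0 / 3.0
T1, V1 = 300.0, 0.020                 # K, m^3
V2 = 0.010                            # compressed to half the volume

T2 = T1 * (V1 / V2) ** (gamma - 1)    # TV^(gamma-1) = constant on an adiabat
W_on = n * Cv * (T2 - T1)             # First Law with Q = 0: Won = dU
print(f"T2 = {T2:.0f} K, work done on gas = {W_on:.0f} J")  # 476 K, ~2198 J

The gas warms because work is done on it, not because the molecules "have less space to move around."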

From student interviews, examples of which are presented in paper 40, students have a number of specific difficulties related to the First Law. In broad categories, these are (a) difficulties in discriminating among related concepts (heat, work, temperature, internal energy), and (b) difficulties in applying the definition of work to an ideal gas undergoing a specific process. On a more general level, there is a strong tendency to treat theorems as formulas, and not as mathematical models of important physical principles. This issue has been previously noted in student (mis)understanding of the work-energy and impulse-momentum theorems. Confusion between heat, temperature, and internal energy has been noted even among instructors and textbook authors. To distinguish conceptual from linguistic issues, students were asked to make predictions concerning physical phenomena. The results show that many students use the concepts of heat and internal energy interchangeably, not merely the words. Confusion between work, heat, and internal energy is shown by a student who states: "no work is done in an isothermal process because the temperature does not change, so no heat is transferred, and thus no work can be done."
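
A quick worked counterexample, with my own numbers, untangles the quoted chain of reasoning: in an isothermal expansion of an ideal gas the temperature does not change, yet heat is transferred and work is done:

import math

# Sketch (illustrative numbers, mine): isothermal expansion of an ideal gas.
# dU = 0 at constant T, so the First Law gives Q = W_by, both nonzero.
R, n, T = 8.314, 1.0, 300.0
V1, V2 = 0.010, 0.020
W_by = n * R * T * math.log(V2 / V1)   # work done by the gas
Q = W_by                               # heat absorbed, since dU = 0
print(f"W_by = Q = {W_by:.0f} J")      # about 1729 J, not zero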

There is failure to recognize that the sign of work is independent of the coordinate system. Some students even spoke of a direction of work, as if it were a vector. Students often considered the sign of work as dependent on the direction of a force, or the direction of the displacement. They are thus incorrectly tying back into mechanics, where the sign of work depends on the relative directions of force and displacement. Using a mechanics problem to free the concept of work from the complications of thermal physics, the authors found that some specific difficulties persist, while a few difficulties, such as reference to sign conventions, disappeared. The key difficulty in comparing the mechanical problem to the thermal problems is that the student had to recognize the relevance of work to the thermal problems, but was directly told of this relevance in the mechanical problem. The recognition that a concept applies to a problem is key to fundamental understanding.

Students fail to recognize that work is path dependent in the general case. Far too many students (mis)generalize from their experience with conservative forces in their mechanics courses, with several students explicitly stating "the work is independent of the path taken." Students are especially likely to neglect the path dependence of work in a cyclic process. One student wrote "W = area under curve" and shaded the appropriate area, but finally answered: "Since there was no change in volume overall, there was no work done on the system."
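
A toy rectangular cycle in the P-V plane, with hypothetical pressures and volumes of my own choosing, makes the path dependence explicit: the gas returns to its starting volume, yet the net work is not zero:

# Sketch (hypothetical cycle, mine): net work over a closed P-V cycle.
# Expand at high pressure, compress at low pressure; the two vertical
# (constant-volume) legs do no work.
P_lo, P_hi = 1.0e5, 2.0e5      # Pa
V_lo, V_hi = 0.010, 0.020      # m^3

W_by_expand = P_hi * (V_hi - V_lo)     # +2000 J done by the gas
W_by_compress = P_lo * (V_lo - V_hi)   # -1000 J (gas is compressed)
W_by_net = W_by_expand + W_by_compress
print(f"net work done by gas: {W_by_net:.0f} J")  # 1000 J, not zero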

Finally, students fail to recognize that the absolute values of the work done on and by the gas must be the same. Students argue that, depending on the direction of motion, one object is "winning," i.e., doing more work than the other object: "There's some kind of force in the opposite (outward) direction. There is work done by the gas, but the work I do is greater. The net force is in this (inward) direction." Work is treated differently in Mechanics and Thermodynamics; this inconsistency "may make it more difficult for students to transfer a concept initially learned in one context to the other."

Students and physicists both first attempted to use the Ideal Gas Law to solve the four thermal problems in paper 40. The physicists quickly recognized the lack of information and turned to the First Law. The students frequently did not even consider the First Law. Their confidence in the Ideal Gas Law was a significant barrier to their even considering the use of the First Law to solve the problems. Many students confirmed their incorrect macroscopic arguments with reference to an incorrect microscopic model. In fact, students believe the Kinetic Theory of Gases proves the Ideal Gas Law. Students fail to recognize that the model is designed to agree with experimental observations of macroscopic phenomena and not vice versa. Of those few students who attempted to use the First Law, most failed because of difficulties with the concept of work. Linguistic complications are real, but remedying them alone will not solve the problem; there is genuine inability to distinguish among closely related concepts. The Physics Education Group at the University of Washington is in the midst of an iterative cycle of research, curriculum development, and instruction to develop a tutorial on the First Law. It will be the third of a series including mechanical work and the Ideal Gas Law.

Five comments: (1) Textbooks and textbook authors are rarely addressed in PER literature. When they are, it's rarely complimentary, as exemplified by this reference. One paper that does address these issues, albeit at the Middle School level, is J. Hubisz's "Report on a Study of Middle School Physical Science Texts" published in The Physics Teacher Vol. 39, pages 304-309. This two-year study of twelve commonly used texts is not complimentary. A fundamental problem was that no one person was responsible for a lower level text. In a few extreme cases, the listed "author" was unaware that he or she was even listed as such. (2) Paper 40 highlights how conceptual and linguistic issues can be difficult to separate, a difficulty many papers ignore. (3) The most interesting part of paper 40 is how microscopic models can be misapplied, a subject ignored by the paper 28 authors in their electronics context. (4) In light of this being a rare thermo paper and my enjoyment of the HPS paper 17, I'd like to bring the following to the reader's attention. Joseph Black was the first person to distinguish between temperature and heat. In "Revisiting Black's Experiments on the Latent Heat of Water," published in The Physics Teacher, Vol. 40, the authors redo two of Black's 1766 experiments. The authors obtained results that match modern results in one experiment but not the other. They argue that mastering the scientific method requires understanding why experiments sometimes do not work: "failed experiments suitably presented and discussed may be of pedagogical value." Other than the equivocal "may be," it's a nice little five-page paper. (5) Paper 40 was the last of the five-paper section addressing student misconceptions. The next three papers address women and minorities as distinct subgroups of physics students.



CHAPTER 29 : Women and Minorities


29.1 : Paper 41, Women's Responses


P. Laws, P. Rosborough, and F. Poodry, "Women's responses to an activity-based introductory physics program," PER Am. J. Phys. Suppl. 67(7), S32-S37 (1999)


The introduction to this reference lists some PER curricula such as Tools for Scientific Thinking, developed at Tufts, and RealTime Physics, developed at the University of Oregon. It presents data from a separate study indicating that women are "drawn to the sort of knowledge that emerges from firsthand observation" and that educators should "...stress collaboration over debate." In Dickinson College's Workshop Physics courses, over 40% of the students are women. The article describes the design of the calculus-based Workshop Physics curriculum. It discusses student learning and attitudes in these courses, and addresses the experience of women. Since 1988, all introductory physics courses at Dickinson College have been workshops.

Learning basic scientific inquiry skills is more important than surveying a large number of topics. Basic inquiry skills are cooperative and activity-centered. Observation, direct experience, and computer use build the physical intuition necessary to understand vital concepts. One reason inquiry skills are needed is that students do not have enough experience with everyday phenomena to tie concrete experiences to scientific explanations. Further, the field of knowledge is expanding; thus the only viable learning strategy is to acquire independent investigative skills implementable at need. While lectures and demonstrations are useful alternatives to reading for transmitting information and teaching specific skills, they do not help students learn to reason, conduct scientific inquiry, or acquire direct experience with natural phenomena. To facilitate original thinking and problem solving, time is better spent in direct inquiry with peers. Instructors focus on creating a learning environment, enhanced by computer usage, in which direct inquiry is productive.

Students meet three times a week and cover twenty-seven units in the course of a year. A normal weekday session is two hours long and involves an instructor, two undergraduate teaching assistants, and up to 24 students. The Workshop labs are open to students during evening and weekend hours. The traditional content has been reduced by about 25%, although electronics and nonlinear dynamics were added. The workshops use the four-part learning sequence by David Kolb (preconception, qualitative observation, development of definitions and theories, and quantitative experimentation to verify mathematical theories). Workshops involve students pitching baseballs, whacking bowling balls with rubber hammers, breaking pine boards with their bare hands, building electronic circuits, and igniting paper by compressing air.

Student learning and attitudes have been measured by numerous instruments, most recently by the MPEX. Workshop Physics is a very positive curriculum with 50% to 90% of its students able to answer counterintuitive questions that only 5% to 40% of students in traditional instruction master. Further, Dickinson College's Workshop Physics students rate a whole range of learning experiences more highly than their cohorts taking traditional courses. Still, there are students unhappy with the workshop method, and the authors are attempting to understand why.

At Dickinson College, 34 of the 89 physics graduates in the last decade were women. One of the more dramatic differences between men and women is the improved attitude (Likert scale) of freshman and sophomore women toward computer usage. After one semester, this attitude average jumped from 2.5 to 4.0 on a 5 point scale. In contrast, freshman and sophomore men went from 3.7 to 3.9 on the same scale. Unfortunately, the other dramatic jump was a drop. In feelings about laboratory work, junior and senior women went from 3.0 to 2.0 on the 5 point scale. This is in contrast to junior and senior men, who went from 2.4 to 3.4 on the same scale. Anonymous interviews with fifteen current and former junior and senior female students revealed that women complained of domineering partners, fears that their partners didn't respect them, and feelings that their partners understood far more than they did. Women also complained about excessive and uncertain time demands. Some participants commented that women participate in extracurricular activities more often than do men, and that having to return to lab at night when an experiment wasn't working was stressful for women. Finally, premeds, women more than men, have been encouraged to view learning as straightforward fact gathering or memorization. This view of learning is in conflict with inquiry. One interviewee sums it up well:


I found in my class that the upper classmen were more frustrated than the freshmen were because you came in and you had other science classes where you'd been taught in a traditional way and they expect you to learn in a totally different way and it's frustrating. I'm a pre-med and I've talked to a lot of other people who are premed and we all felt the same way. We've been conditioned to learn in a certain kind of way and we weren't learning that way. When you come in as a freshmen, I think it's easier for them. And usually freshmen are carrying all 100-level classes. I didn't have extra time to spend worrying about it.


In paper 41's conclusion, several other papers are referenced in the areas of motivation, grade orientation, and different understandings of the nature of learning. Two studies are referenced. First, a Harvard poll shows that 41% of upper-class women and only 31% of upper-class men reported involvement with volunteer activities [there are no comments on hours spent at paid work]. Second, a multi-institutional survey revealed that the time demands of Workshop Physics were not greater than those of introductory physics courses at other institutions. Additional papers are cited in a growing literature: about women being more sensitive than men to the opinions of others, about women lacking intellectual confidence in the sciences after years of socialization, about the greater sensitivity of women to grade stresses and competitions, and about problems due to women being in earlier stages of intellectual development. Finally, the authors advocate exposing women to many courses that encourage reasoning and direct observation early in their schooling.

I really like the idea of igniting paper via compressed air. Paper 41 reiterates how current students have not had enough experience with everyday phenomena to tie concrete experiences to scientific explanations; not enough messing about, to tie in with paper 34. The small list of activities and the authors' final advocacy lead me to suggest that Workshop Physics might be a good jumping off point for a high school physics curriculum. They mention cutting 25% of the traditional curriculum but don't specify what was cut. Electronics is added, which should gladden the heart of paper 27's author. Having the labs open nights and weekends, with the expectation of students correcting or finishing work not completed in class, improves on paper 12, in which correcting or finishing was rolled over into the next scheduled lab period. That the authors only offer the vague standard of "counterintuitive questions" by which to judge themselves as a "very positive curriculum" is disturbing, particularly as they don't offer a few sample problems as an appendix. Having two undergraduate TAs and an instructor for twenty-four students is wonderful. There was a lot of good PR to help Dickinson College swallow a fairly bitter pill in its 3.0 to 2.0 drop. Premeds are a large percentage of the introductory physics classes here at UCSC, and I find myself in anecdotal agreement with the paper's quote that they have the perception "that getting an 'A' in physics is critical to future success while a functional knowledge of physics is not critical." A final comment: the IMPAC paper 16 also is quite focused on women (and minority) retention.


29.2 : Paper 42, Women in Physics: A Review


L. McCullough, "Women in Physics: A Review," The Physics Teacher Vol. 40, 86-91 (2002)


Women are very underrepresented in the field of physics. This is bad because, to sustain our technological civilization, "every one of our future workers" must be prepared in science, engineering, and mathematics. The proportion of students taking physics in high school is a low 27%, but 47% of them are young women; unfortunately, girls remain underrepresented in AP physics classes. The 47% shrinks to 31% for physics classes in two-year colleges, with only one in five women taking a calculus-based course. This leaky pipeline outputs only 19% of its physics bachelor's degrees to women. One reason for this low representation is the lack of role models; only 16% of the members of the American Association of Physics Teachers are women. Research suggests that female mentors may have more of a positive impact on female students. Both men and women cite poor teaching and extreme competitiveness as reasons why they left science.

Research has developed methods that increase women's participation in science. These include mentoring programs, advisement education, and support programs. Some examples include Mentor.net, the AWIS Mentoring Project, and STEPS; paper 42 has many more listings. Classroom climate sends subtle cues that science is not a field for women, with the widely used FCI mentioning rockets, hockey pucks, and cannonballs; "these male-oriented contexts may be negatively affecting women's scores." Problematic classroom behavior includes calling more often on men and asking easier questions of women. "Properly structured group work can be a particular blessing for young women. Unfortunately the wrong group structure can easily negate all these benefits for women." Respect helps everyone, and such mnemonics as "Bad Boys Rape Our Young Girls" for resistor color-coding order are extremely disrespectful. While those things that help women also often benefit men, it is important to be aware of the issues affecting women's participation in physics.

Paper 42 is rare in that it mentions why teaching physics in the first place is such a good idea (trained workers are needed by society). It uses the metaphor of a leaky pipeline, which D. Goodstein will take issue with in paper 45. The use of "may be negatively affecting women's [FCI] scores" is too vague; if a study exists proving a gender disparity in FCI scores, it should be referenced. The FCI authors claim, in paper 1, that there is no such bias. If such a bias exists, proving that the bias is a result of word choice (hockey pucks) is beyond my skill, but surely a linguistics researcher could be called upon to do a study on the issue. Vague assertions invite ridicule and provide no lasting benefit to either side of a debate. Ignored in paper 42 is the idea that the low percentage of female participation in physics may in fact be the result of women choosing other, better fields, rather than being denied opportunities to succeed in physics. Technological jobs and training often make costly social demands on their participants. Such participants are labeled geeks for a reason. A recent (spring 2002) series in the San Jose Mercury newspaper highlighted the costs of being an electrical engineer in a start-up company; 70+ hour work weeks and missing important social events involving your children, such as their Little League baseball games, head a list that many people would rationally choose to forgo. Rather than worry about so few women, perhaps we should worry about why so many men deliberately train for such a life. While research "suggests" that female mentors "may have" a more positive impact on female students, M. Schneider, in "Encouragement of Women Physics Majors at Grinnell College: A Case Study," published in The Physics Teacher Vol. 39, makes a very powerful statement in favor of assuming personal responsibility to do whatever the job takes, the job being retaining and encouraging female participation in physics. Although not immediately evident in the paper, M. Schneider is a man. At Grinnell College, roughly 50% of the physics majors are women. A strong distinction between retaining and recruiting is made; there is no external recruitment due to a philosophical desire not to rob Peter to pay Paul. The upcoming paper is the last for our women and minority chapter. It focuses on minorities at the City College of New York.


29.3 : Paper 43, Reform in a Multi-Native Language Environment


R. Steinberg, and K. Donnelly, "PER-Based Reform at a Multicultural Institution," The Physics Teacher Vol. 40, 108-114 (2002)


Most successful PER-based reform strategies require significant interactions among students. Given the premium on language and interpretation of context, it is reasonable to ask how such reforms work in a multicultural, multi-native language environment such as that found at the City College of New York (CCNY). One non-Newtonian source of students' initial knowledge is television, where, for example, the Starship Enterprise slowed to a stop because its engine was turned off. Our students faithfully reproduce the same reasoning. A lifetime of experiences pushing boxes and riding in cars is not dismissed for the sake of a memorized equation, even if we tell students explicitly to do so. The same student who uses Newton's Third Law on an exam does not use it when asked about colliding trucks and cars outside of class. Homework often fails to build a conceptual framework; it is often reduced to finding a formula with the right combination of symbols or finding a similar worked-out problem.

CCNY has students from over 90 countries. More than half of the introductory calculus-based physics students are from other countries and/or have learned English as a second language (ESL). About two-thirds of the students have taken high school physics. Both "Tutorials in Introductory Physics" and ILD are used in one class at CCNY and are compared and contrasted to a traditional class, also at CCNY, using pre- and post-FCI results. In the reform class, 84 students took the first midterm and 63 took the final; in the traditional class, the numbers were 74 and 50 respectively. While the drop rates were typical, the drop rate was lower for the reform class. One of two recitations a week was given over to Tutorials, and ILD took about 20% of the lecture time. Thus, the reform class had fewer opportunities to review standard homework problems and did not see all the demonstrations and derivations that the traditional class did. The reform was replacement, not supplemental, with the total time for both classes being the same.

Class results are similar to previously published results, with the traditional class having a <g> of 0.23 and the reform class a <g> of 0.43. Within the traditional class, native English speakers (NES) had <g> = 0.26, and ESL students <g> = 0.21. Within the reform class, NES <g> = 0.46 and ESL <g> = 0.42. These results suggest native language is not a significant factor. Please note that all students spoke English in class. On four common exam questions, there were no "significant or consistent differences" between NES and ESL. The reform class outscored the traditional on both qualitative and quantitative problems; "apparently spending the extra time helping students to understand the underlying ideas of kinematics helped them succeed on quantitative problem solving." As quality of instruction is often judged by student opinion, even though the correlation between opinion and the real success of a course is dubious, it is nice to see the enthusiasm of the students. ILD got a 4.7 and Tutorials a 4.5 out of 5.0 on the question "How effective do you feel each item below was in helping you to understand physics?" In comparison, laboratories got a 3.2 and textbooks a 3.8.
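
For any reader who has not internalized Hake's average normalized gain from paper 2, a one-line sketch suffices; the pre/post percentages below are hypothetical, chosen only to reproduce the reform class's <g> = 0.43:

def normalized_gain(pre_pct, post_pct):
    # <g> = (post - pre) / (100 - pre): the actual gain as a fraction of the
    # maximum possible gain, given the pretest score.
    return (post_pct - pre_pct) / (100.0 - pre_pct)

print(f"{normalized_gain(50.0, 71.5):.2f}")  # 0.43, e.g. 50% pre, 71.5% post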

The introduction to this paper is a bit jarring, and its recapitulation of PER is not as helpful to PER readers as some language background would have been. Still, I like the Starship Enterprise example and the acknowledgment of T.V. as a source of student misconceptions. This is the only reference of which I'm aware that addresses dropout rates as distinct from failure rates, a singularity I find odd given the preoccupation high school personnel have with the subject. The dropout rate for the traditional class was 32% and for the reform class 25%. While the reform class rate is lower, a standard deviation and a mean for all traditional classes over the last couple of years would have made the given percentages more meaningful to the reader. Deciding that differences in <g> between NES and ESL students of 0.04 or 0.05 were "not significant" is odd given that Hake's standard deviation is 0.04 for traditional classes (paper 2). Hake's 0.14 standard deviation for reform classes is immaterial in this case, as the 0.14 is attributed to variations not present in this comparison. In this case, all students were subjected to the same I.E. option with the same implementation effectiveness. This study's attempt to hold the total time spent on physics the same for traditional and reform classes is a strong positive. D. Holcomb, writing in Spring 1999's Forum on Education published by the American Physical Society, tells us what he considers good PER and by what criteria he judges whether to pay attention to a particular study. Of the five listed, he highlights one: the need for a real control group which spends the same amount of time on the topic as does the experimental group, even if using a different pedagogy. While our authors don't break time spent out by topic, they at least hold total time equal. It would be interesting to do a comparison of various introductory physics classes to see how much time is spent in the classroom, and whether there are required lab and/or discussion section time commitments. At UCSC, introductory classes meet 70 minutes a day, 3 days a week, for a ten-week quarter, which totals roughly 34 hours with a bit of variation due to holidays. The required accompanying lab meets 3 hours a day, eight times a quarter, for 24 hours. The available discussion sections are not required. The lecture and lab are separate for-credit classes, 5 quarter units for the lecture and 1 for the lab. Leaving our women and minorities chapter, we turn our focus to awards. The upcoming four papers provide some insight into the development and focus of PER as well as shine a light on some of PER's more prominent practitioners.



CHAPTER 30 : Awards for Excellence in Teaching


30.1 : Paper 44, F. Reif's Millikan Lecture 1994

Prefacing Reif's paper is an introduction by R. Alley which provides a brief description of the award and of the man. Since 1963, the Robert A. Millikan Medal has been awarded at the AAPT Summer Meeting "for notable and creative contributions to the teaching of physics." The 1994 Millikan Medalist is Frederick Reif, Distinguished-Senior Professor in the Center for Innovation in Learning and in the Departments of Physics and Psychology at Carnegie Mellon University.


F. Reif, "Millikan Lecture 1994: Understanding and teaching important scientific thought processes," Am. J. Phys. 63(1), 17-32 (1995)


Even students with good grades in their basic physics courses have scientific misconceptions, poor problem-solving skills, retained pre-scientific beliefs, and an inability to apply what they have ostensibly learned. One is left with two questions: why is this the case? and what can we do to change this dismal state of affairs? The instructional problem is to transform the student from his initial state into the desired final state. This requires that: (1) the desired final state of abilities and observable performance be clear and specific, (2) the initial state's characteristics and performance be described, (3) the student's desired final state and different standards be compared, thus revealing tacit knowledge, learning difficulties, and deficiencies in instruction, and (4) a practical, effective learning process be designed to effect the transformation. This approach requires understanding the thought processes leading to the desired performance.

Reif's central instructional goal is to help students acquire a "modest amount of basic knowledge which they can flexibly use." Flexible utility is centrally important because science is the ability to use a small amount of basic knowledge to predict or explain many diverse phenomena. It is further important because this knowledge must retain its usefulness in a complex and rapidly changing world. The cognitive abilities required to ensure that scientific knowledge can be flexibly used include: interpretation, description, organization, analysis, construction of solutions, and checking of those solutions. Interpretation is the process of unambiguously applying a general abstract scientific concept to any particular instance.


For example, suppose that somebody tells me that a triangle is a three-sided polygon. However, the person cannot recognize a triangle among some other geometric figures, nor construct a triangle with three sticks. Then I would say that the person has some nominal knowledge about a 'triangle,' but does not know how to interpret this concept.


As an example of observed interpretation deficiencies, Reif focuses on acceleration. His basic assertion is that someone able to interpret the concept 'acceleration' should be able to identify the acceleration of a particle in various specific cases. One such case is:


An oscillating pendulum bob which is momentarily at rest at extreme point A of its circular arc, passes point B with increasing speed, reaches its maximum speed at its lowest point C where the string is vertical, continues past the point D, and is again momentarily at rest at point E. [Draw the acceleration vectors at each point on Fig Z. Please actually do this prior to reading further.]


In the hopes of encouraging you to read this reference, I'll tell you that the answer is given as Fig. 3 in paper 44. I will also pass on two pieces of information: (1) of the 124 students who studied acceleration in an introductory physics course, not one could answer this problem correctly, and (2) nine out of eleven graduate students answered this question incorrectly on a Ph.D. qualifying examination. One reason for the observed interpretation deficiencies is that students retrieve remembered or plausible knowledge fragments which are often incorrect and which are rarely checked against a definition of the concept. For example, many students deem it obvious that a particle's acceleration is zero when its velocity is zero. Even when students do invoke the definition of a concept, they often are unable to interpret it properly. For example, a student in reference to point A stated:


The velocity is zero, so the acceleration has to be zero. Because acceleration equals the change in velocity over the change in time...I mean, acceleration is the derivative of the velocity over time. And the derivative of velocity is zero.


Cognitive analysis emphasizes procedural knowledge, which specifies what one must actually do to unambiguously construct the concept in any particular instance. For example, acceleration's defining statement is a = dv/dt, but its interpretive method (procedure) is a five-step process:

(1) Original velocity v. Identify the velocity of the particle at the time, t, of interest.

(2) New velocity v′. Identify the velocity of the particle at a slightly later time, t′.

(3) Change of velocity Δv. Find the velocity change Δv = v′ − v of the particle during the small time interval Δt = t′ − t.

(4) Average acceleration, aav. Find the ratio Δv/Δt, the "average acceleration" of the particle during time Δt.

(5) Acceleration, a. Determine the limiting value approached by the average acceleration if the time t′ is chosen very close to t... the resultant ratio dv/dt is then called the "acceleration of the particle at the time t."

This five-step complexity is hidden in the defining statement. The five-step process is a formal method that ensures reliable accuracy, but it can be quite laborious.
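
As a check on my own understanding, here is a short Python sketch (mine, not Reif's) that carries out the five steps numerically for uniform circular motion and recovers the special-case result a = v²/r, directed toward the center:

import math

# Sketch (mine, not Reif's): the five-step interpretation of a = dv/dt applied
# to uniform circular motion of radius r at speed v.
r, v = 2.0, 3.0                     # m, m/s
omega = v / r                       # angular speed

def vel(t):                         # steps 1-2: velocity at times t and t'
    return (-v * math.sin(omega * t), v * math.cos(omega * t))

t, dt = 1.0, 1e-6                   # step 5: choose t' = t + dt very close to t
v1, v2 = vel(t), vel(t + dt)
dvx, dvy = v2[0] - v1[0], v2[1] - v1[1]   # step 3: change of velocity
ax, ay = dvx / dt, dvy / dt               # step 4: average acceleration
print(f"|a| = {math.hypot(ax, ay):.4f} m/s^2, v^2/r = {v * v / r:.4f} m/s^2")
# both print 4.5000; the numeric limit reproduces the compiled special case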

Efficiency, the ability to interpret a concept rapidly and with little mental effort, is essential for effective performance. The key to efficiency is to build a repertoire of knowledge about special cases of the concept (such as circular motion with constant speed). When an encountered situation is a special case, it can be interpreted automatically, without resorting to definition or derivation. Instructors must check that students can interpret a concept before they use it in more demanding problem-solving tasks. Reif presents a three step strategy for teaching a concept: (a) after motivating and introducing the concept, specify it explicitly together with the associated method required for its interpretation; (b) let students themselves apply this method consistently in various special cases, including cases prone to error; and (c) ask students to summarize the results of their concept interpretations in these special cases so that they acquire a useful repertoire of compiled knowledge.

Efficient case-specific knowledge is prone to error. Fine discriminations or validity considerations are all too easy to ignore. Furthermore, a student's everyday common sense is not subject to the stringent requirements of a scientific intuition. This common sense can easily corrupt the scientific intuition. For example, what is your first quick efficient intuition as to the angle between vectors A and B? [Please look at Fig Y before reading on.] If your first glance said sixty degrees, you are not alone, but you should look again. Thus, it is essential that one be able and willing to check whether intuitively-applied compiled knowledge has, in fact, been correctly applied.

Knowledge may be described in terms of different concepts, with different symbolic representations (words, pictures, math) and with different degrees of precision. These different descriptions of the same knowledge are NOT equal. A given task may be facilitated by one description but hindered by another. Thus, we must know which descriptions are useful for which tasks, and know methods for implementing these descriptions. Both quantitative and qualitative descriptions are essential for scientific work; description is also important in problem solving. Interpretation of a concept requires that all the ingredients be properly described, including the concomitant knowledge. For example, in accordance with the five-step method to find acceleration, one must be able to describe the velocities of a particle at two neighboring times and the difference of these velocities. Thus, one needs to know that velocities are tangent to the path, and how to subtract vectors. One must also have the words "tangent" and "vector" as pre-instruction vocabulary words.

Newton's Law Ftot = ma is Reif's example of description. He notes that such description is far from trivial and often inadequately performed. For example, only 20 out of 79 students correctly answered his example problem even after a tutorial on forces presented by Shaffer and McDermott at the University of Washington. Reif presents a Cognitive Analysis which increases the success rate of students to 90%. This analysis "transcends" conventional free-body diagrams. Reif notes that quantitative descriptions and qualitative descriptions of a task are complementary. He speaks at length, quotes Einstein and Feynman, and ends with Hans Bethe, who comments:


From Fermi I learned... to look at things qualitatively first and understand the problem physically before putting a lot of formulas on paper. ...Fermi was as much an experimenter as a theorist, and the mathematical solution was for him more confirmation of his understanding of a problem than the basis of it.


The instructional implications are: (1) teach explicit methods generating the descriptions necessary for interpreting scientific principles; (2) teach explicitly the prerequisite knowledge required for description and interpretation; and (3) emphasize both quantitative and qualitative descriptions which are balanced and complementary. Balanced complementary instruction can be provided by: (a) embedding quantitative discussions in qualitative frameworks; (b) solving qualitative as well as quantitative problems; and (c) using qualitative checks and dependencies.

Knowledge must be organized so as to facilitate its effective use. The scientific knowledge organization of students is often quite incoherent. This incoherence is made manifest by paradoxes the students find unresolvable. Physicists believe that physics requires little memorization; they rely on their highly coherent knowledge organization to infer detailed information. The fragmented knowledge organization of a student provides no such benefit. While student-generated concept maps are better than isolated knowledge bits, teacher-provided explicit hierarchical knowledge organization can help the student easily and accurately retrieve his scientific knowledge. Tests have shown that, in comparison to a linear structure, a hierarchical structure allows students to better remember information, to modify it, and to detect errors in it. The instructional implications go beyond providing physics knowledge to the student in a coherent hierarchical manner. The most difficult requirement is to ensure that the knowledge in the student's head is well organized. To this end, a student could be given a compact summary of basic knowledge and then asked to use it in solving and justifying a set of problems.

Problem solving is essential to attain the scientific goals of explaining, predicting, and designing. There are two basic difficulties: (1) how does one initially describe and analyze a problem so as to identify potentially useful action choices? and (2) out of all possible action choices, most of which lead nowhere, how does one make judicious decisions so as to select an action sequence that leads to the desired goal? The most common method of teaching problem solving in physics is that of example and practice. This method is flawed to the point of being labeled "unwise" by the author; Reif goes on to suggest a heuristic strategy that is far more effective. The three major steps in this heuristic problem-solving strategy are initial problem analysis, construction of a solution, and checking of the solution. As an example of initial problem analysis, the author turns a standard word problem into system diagrams; he also notes that the initial analysis of a problem can greatly affect the ease of its solution. Construction of a solution involves recursively looking for useful subproblems and implementing a solution through the use of a hierarchical knowledge base. This implementation is facilitated by previously acquired interpretation and description knowledge. The key to checking a solution is to revise it if any deficiencies are detected; some standard checks are: (1) goals attained, (2) well specified, (3) self-consistent, (4) consistent with other information, and (5) optimal.

Reif suggests having students analyze some problems without solving them, thereby expressly practicing the skills of analysis, description, constructing appropriate diagrams, and clearly specifying the problem goals. Reif also proposes giving students a well-organized summary of a very few basic relations and requiring that all solutions be based on these alone. He recommends distinguishing between wrong and nonsensical answers, providing a more severe penalty for the latter. Reif notes that the various cognitive issues in his paper need to be addressed jointly in actual instruction and that there is a large gap between small-scale studies and practical education delivery. Practical education implementation relies on other people, and thus, is facing difficulties. To help achieve practical education delivery, he has written a textbook with a necessary complementary workbook to actively engage the student. Reif strongly states that "no learning can occur unless the students engage in active thinking." To promote active thinking, he relies on his workbook, interactive lectures, collaborative small group discussion sections, and frequent diagnostic tests. In support of his method, he offers a common exam problem on which his students gave correct answers 45% of the time, and traditional class students gave correct answers 10% of the time. He notes that 70% of his students "reasoned correctly" but that there were some "minor" algebraic mistakes.

Students have misconceptions about the physical world that are difficult to change, but even more difficult to change are their misconceptions about the goals of science and the kinds of thinking science requires. It is not all number crunching, and resentment can be aroused if the learning goals of the teacher and students are mismatched. Using the goals of science and the ways of thinking useful in science as the framework in which more specific scientific knowledge and methods are embedded helps reduce teacher and student mismatches. The real challenge is a cost-effective method of providing guidance and feedback to students. Individual tutoring by instructors is much more effective than homework, which too often merely perpetuates bad habits such as haphazard use of formulas. It's uncertain but possible that computers or trained collaborative-learning groups might solve this serious challenge.

I really enjoyed Reif's five-step process to find an acceleration; it exemplifies how physicists use math. His emphasis on problem solving being essential to scientific goals is a good balance to PER's qualitative bias. Having students check their solutions as a process step is great; it reminds me of an anecdote from my middle school teaching days. An eighth grader had completed a word problem and had used a calculator. He proudly told me that the man was 800 years old. I asked him whether he knew anybody who was 800 years old. Aggrieved and irritated, he said: "That's what the calculator told me." While doing math, it is important to keep one's brain engaged. While it may be that the man is 800 years old, it is highly improbable, and all aspects of the problem should be reviewed; if no error is found, an expert should be consulted. The blind acceptance of the calculator's authority undermined any hope of getting this particular young man to perform, or even want to perform, a reality check. So I certainly agree with Reif's desire to differentiate between wrong and nonsensical answers. He mentions giving frequent diagnostic tests; I'm curious about the time cost, and how the time lost to testing is made up. At Cornell, I'm told by an alumnus, all testing is done outside of class at a testing center. This certainly frees up several hours for additional in-class learning. The author's "ostensibly learned" comment brings to mind a recent occurrence. I TA an upper-division lab here at UCSC; a group of four students doing a gamma ray absorption lab were unable to tell which shielding plates were made of lead and which of graphite. They did not know that graphite is carbon nor that lead is a malleable metal. They did know lead was heavy, but were confused because the plates were not the same dimensions. Adding to the confusion was their misconception about pencil "lead". A final comment, to tie in with Reif's writing of a textbook: one of the few other reform textbooks was written by Arnold Arons. P. Blanton, in "Lessons Learned from Arnold Arons" published in The Physics Teacher Vol. 39, pays tribute to the late Arons. Arons was a pioneer in physics education, a former AAPT president, and an Oersted Medal winner. He left two very significant instructional manuals for physics teachers: (1) A Guide to Introductory Physics Teaching, and (2) Teaching Introductory Physics. The upcoming paper is by another Oersted Medal winner. It has a very rare focus on why we should teach physics; after all, if we only need a few expert physicists, we're already doing a good job. Some of the best physicists in modern times were educated in American universities and now work in them.


30.2 : Paper 45, D. Goodstein's Oersted Medal Speech 1999


D. Goodstein, "'Now Boarding: The Flight from Physics,' David Goodstein's acceptance speech for the 1999 Oersted Medal presented by the American Association of Physics Teachers, 11 January 1999," Am. J. Phys. 67(3), 183-186 (1999)


Today, the profession of teaching physics has only two purposes: one is to turn out physicists, and the second is to act as gatekeeper, keeping a few poor souls out of medical school. We need to reevaluate our jobs. We physicists understand, in very large measure, how the world works. To live in ignorance of this understanding should be intolerable for an educated person. The undergraduate physics major is the liberal arts education of the 21st century. The methods, textbooks, and language we currently use are useless to this reevaluation. They are designed to get rid of the unworthy, not to throw open the doors to the multitudes. The first very small step to reform, indeed the key to teaching anything, is to remember what it was like to not understand that thing. To view our understanding as obvious - to anyone worth teaching - is not even remotely helpful.

The educating of the masses, not just the elites, started long ago. From 1900 to 1940, forty-two states passed compulsory school attendance laws for all children up to sixteen or eighteen years of age. These laws were considered by many Europeans of the time as "fantastic" and "ridiculous," because at least 90% of any population must gain its livelihood in "manual or service pursuits." The G.I. Bill of Rights and the Higher Education Act of 1965 carried the masses to college, long a reserve of the elites, and so, for the first few decades after WWII, there was explosive growth in the academic world. Today, nearly two-thirds of all American high school graduates go on to college. In fact, this explosion was the tail end of an exponential growth pattern starting in the 1700's in Europe. Exponential growth cannot continue forever, and for physicists, it ended in the 1970's. Unfortunately, all of our American academic patterns implicitly assume exponential growth. To maintain the current level of professorships, each current professor needs to train as his replacement only one graduate student. In our system, a professor averages fifteen graduate students over his career, each of these in turn generates fifteen graduate students, and so on; this is the definition of exponential growth. Even with the advent of postdocs, we have still "run out of jobs" - which is false, of course. The high schools are begging for competent science teachers. High school teaching, however, is not the job a Ph.D. has his heart set on. It is a use for those B.S. physics degrees that should be our new focus. The golden years ending in the 1970's have left us three legacies: (1) the best scientists in the world, (2) a large number of foreign graduate students, and (3) an education system reminiscent of diamond mining. It is this education system, which deliberately throws out so much "common human debris," that we are called upon to change. The education system must be changed, in part because it foisted onto us the paradox of the Scientific Elites and the Scientific Illiterates, and in part because the conditions of the past that allowed for exponential growth are irrevocably gone. The immense expansion of the academic world that soaked up most of the earlier Ph.D.'s is over, forever.

Paper 45 is well worth reading. It has an enjoyable discussion of why "diamond mining" is a much better metaphor than a "leaky pipe" for our current physics education system. Goodstein's focus on correcting our societal burden of Scientific Illiteracy is more personally motivating than a focus on worker training. Essentially, for PER to matter, one needs a reason to teach the masses; we already generate quite competent Scientific Elites. A. Hobson, in "Science Literacy and Departmental Priorities" published in Am. J. Phys. 67(3), is the only other author to address the root motivation for PER: teach the masses physics. Hobson argues that the priorities of any large physics department are faculty research first, followed by (in order of priority): Ph.D. students, Master's students, upper-level undergraduate majors and courses, introductory courses for majors, introductory courses for other scientists and engineers, and finally, lowest of the low and often entirely absent, physics for non-scientists. Non-students are entirely absent. This priority list must be stood on its head. A fundamental problem of our times is the scientific illiteracy of the general population. This illiteracy threatens the foundation of our industrialized and democratic society. Physics departments should make it a top priority that 50% - 75% of all nonscience undergraduates take a science-literacy physics course oriented toward scientific methodology and the connections between physics and society. One roadblock to physics departments conducting science-literacy outreach to the general public is the departments' research-oriented hiring and tenure practices. New faculty are never selected primarily for their teaching skills and must subordinate everything to outstanding research performance to keep their jobs.


30.3 : Paper 46, A. Van Heuvelen's Millikan Lecture 1999

A. Van Heuvelen is the 1999 Millikan Medalist. He is a Professor of Physics at Ohio State University. He has developed the Overview Case Study (OCS) method of instruction and the Active Learning Problem Sheets (ALPS Kits). He co-authored ActivPhysics. Dr. Van Heuvelen is currently with the Ohio State University Physics Education Research Group, which had seven graduate students in 1999. He has also authored Experiment Problems for Mechanics and Experiment Problems for Electricity and Magnetism.


A. Van Heuvelen, "Millikan Lecture 1999: The Workplace, Student Minds, and Physics Learning Systems," Am. J. Phys. 69(11), 1139-1146 (2001)


Heuvelen discusses the desired outcomes for our education, the student's mind, and the features of learning systems that help students achieve the desired outcomes. He presents an analogy comparing our educational system to a high-fidelity speaker. The student's mind is analogous to the sound waves, and the conceptual and procedural knowledge we would like the students to acquire is analogous to the electrical oscillations. The matching of impedances and the importance of smooth transitions at interfaces is stressed. This model helps identify three goals for physics education. First, we must choose the conceptual knowledge, process skills, and personal characteristics that we want students to acquire. Second, we must determine the characteristic impedance of the student mind. Third, we must build an education system that matches impedances and smooths transitions.

The desired outcomes come from two sources: Bloom's Taxonomy and some recent (1999) workplace studies. Heuvelen presents Bloom's Taxonomy in Table 1 of paper 46. Bloom's six educational objectives, in ascending hierarchical order, are: (1) knowledge, (2) comprehension, (3) application, (4) analysis, (5) synthesis, and (6) evaluation. Heuvelen contends that traditional courses deal only with the first three objectives, and focus mostly on mere knowledge. He then poses two rhetorical questions: whether real science should place "more emphasis on the higher level cognitive skills?" and whether the lower levels are "important in the practice of real science?" He then "seeks" answers to these questions in several workplace studies. Heuvelen's "possible" list of educational objectives is: (a) learn the skills needed to solve real problems; (b) learn to design and conduct scientific investigations; (c) develop the skills needed to design a system, a component, or a process; (d) develop the ability to function effectively on a multidisciplinary team; (e) learn the skills needed to engage in lifelong learning; and (f) learn to communicate effectively. These objectives come out of three workplace studies and are underpinned by Bloom's Taxonomy. The workplace studies are: (1) "Shaping the Future" by the National Science Foundation, (2) the ABET Engineering Criteria 2000, and (3) an American Institute of Physics (AIP) survey. The AIP survey found that 82% of B.S. physics graduates have final careers in industry and government. The top three skills needed in those workplaces are problem solving, teamwork, and communication skills. Physics knowledge is the least used skill because most former physics students work in fields other than physics. Heuvelen concludes that "we could meet our students' needs better by going into greater depth with reduced content."

The Nature of the Mind and Matching Impedance is the next major section of paper 46. The author asserts that humans are pattern-recognition animals, who try to match their new experiences to previous events. Given that the symbolic representations of physics are not common previous events, we end up with problems. In fact, most common previous events are "picture like representations that emphasize qualitative features." It is, of course, possible to learn abstract languages, but studies in linguistics have emphasized five points: (a) the need for referents, (b) the requirement for multiple exposures, (c) the benefits of multiple representations, (d) the helpfulness of interactive simulations, and (e) the critical importance of starting early. Brain research has shown how detrimental aging is to the development of new synapses and how ingrained old patterns become. Referents are actual objects that provide concrete grounding for abstractions. Whether a ball given to a kid at the same time the word "ball" is spoken, or the use of Feynman diagrams as a more concrete referent for QED interaction processes, referent use helps the initial learning process. The less abstract the referent, the better. Multiple exposures are a necessity. Arons has said that students need six or more exposures, over an extended time interval and in a variety of contexts, to master a new concept. Multiple representations help students learn in several ways; even later in their development, "Jeopardy" questions (paper 20) are beneficial. Heuvelen offers an example of a single question presented by five very different methods: words, pictorial, diagrammatic, graphical, and mathematical. Interactive simultaneous representations are a feature of ActivPhysics, and Van Heuvelen views this use of computers as particularly useful in answering "what-if questions." He quotes Reusser: "Computers... are ideally suited to providing both representational and procedural facilitation to student's understanding." The key to starting early is the realization that as a child, the physics student had already mastered a very abstract language, and that his ability to acquire new languages deteriorates with age. The author's style is to make his point with questions; I'll provide an example from this topic:


Pediatric neurologist Harry Chugani said, "...who is the idiot that decided that students should learn new languages in high school?" Should we make a similar statement about physics learning?


While no rhetorical answer is provided, figure 7 in paper 46 shows a strong drop-off in new-language fluency at around ten years of age.

Having addressed both the desired outcomes and the student's mind, we now address the learning system needed to help the mind achieve the outcomes. Matching the system's impedance to the impedance of the student's mind involves the issues spoken to earlier, notably the need for referents, multiple representations, and multiple exposures. Matching the system's impedance to the impedance of the desired outcomes involves the points that follow. The points to be incorporated into a learning system are: problem solving, design, epistemology, active learning, teamwork, and learning to learn. The problem solving should be focused on real-world problems that are poorly defined, that consist of multiple smaller problems, and that require an organized structure of conceptual knowledge to solve. An example, the spring-launch experimental problem, is given; references to additional free problems are also given. Finally, these real-life problems are contrasted with stereotypical, prechewed end-of-chapter problems. Design is an upper-level Bloom's Taxonomy educational objective and one of the most frequent activities of physicists in the workplace. In response, at Ohio State University, students design their own experiments to determine some property of a system. In the epistemology section, the author presents traditional science education as ready-made knowledge based on the student's belief in authority (professor, TA, textbook). He implies that an effective method of inquiry, based on the students observing and modeling real phenomena, would be a big improvement over traditional science education. In an approach attributed to Etkina, the student "finds" experimental relationships, proposes laws, and applies them to "interesting" problems. Active learning is important because studies show that we remember only 3% of what we hear, and Hake has shown large gains in active engagement classes over lecture-based instruction. Teamwork is one of the highest priorities that the "real world" needs from education. Teamwork also promotes learning; Johnson found that cooperative learning classes had almost a full grade point higher achievement than traditional classes. Learning to learn is admitted to be a fuzzy goal, but one that allows people to "perform effectively when situations are unpredictable and tasks demand change". A table with some suggestions for integrating learn-to-learn strategies into conventional instruction is given.

There are six points with which I take issue. First is his answer to why we teach physics; he appeals to both Bloom's Taxonomy and workplace studies. Having sat through six years of secondary school teacher in-services, I have seen Bloom's Taxonomy used to justify too many weird or failed ideas. In fact, Bloom became old hat and has been supplanted by more modern theories in educational circles. The appeal to workplace studies brings facts and a valid perspective to the motivation debate; however, it is a perspective with which I fundamentally disagree. I do not believe education's primary focus should be job preparation. Second, while we may well need or choose to reduce content coverage, justifying this choice by showing that most holders of a B.S. in physics work outside the field is akin to not teaching lawyers criminal law because most of them will not be criminal justice specialists. Third, the use of student grades to compare instructional programs, particularly on a small scale, is worthless. Independent of instructional program, instructors have wildly varying grade criteria. I know an instructor who would not fail anyone who attended class, in an attempt to combat dropout rates. I know another who regularly failed between a third and a half of his classes, in an attempt to uphold standards. Fourth, Heuvelen's comment on poor lecture gains (on the FCI, presumably) is invalid given the papers on ILD (#9), APF (#10), and PI (#11). Fifth, his quote from education research that people remember only 3% of what they hear meshes with data from my aforementioned in-services. However, for completeness, people remember roughly 10% of what they see and up to 30% of what they both see and hear. They retain about 50% of what they write and around 70% of what they actually do with their bodies (body kinesthetics). Taking notes during a lecture that uses overheads and a chalkboard is not a waste of time, though admittedly not as good as body kinesthetics. There are time and money constraints on any real-world system, and hands-on activities are not always an option. Even the substantially restructured physics course at Colgate University uses bubble chamber photos rather than the real thing (paper 29). Rarely is one only listening to a lecture that provides no visual stimulation. [It is important to remember that "retain" means short-term retention in all cases, including Heuvelen's; long-term retention almost always requires multiple exposures, no matter what specific learning modality is used.] Sixth, this business about concrete referents is valuable, but how to achieve the shift from concrete to abstract is left too vague.


30.4 : Paper 47, L. McDermott's Oersted Medal Lecture 2001

The Oersted Medal is AAPT's most prestigious award and has been given annually at the AAPT Winter Meeting since 1936. This award recognizes individuals who have made "notable contributions to the teaching of physics." The 2001 Oersted Medal was awarded to Lillian C. McDermott, Professor of Physics at the University of Washington, leader of its Physics Education Group (PEG). Professor McDermott has directed the PEG for twenty-five years. Sixteen graduate students earned a Ph.D. in physics for research in physics education through the PEG. Two kinds of research-based instructional materials have been developed at the PEG: "Tutorials in Introductory Physics" and "Physics by Inquiry". The PEG has been the model for many programs developed by both her graduates and her many visitors.


L. McDermott, "Oersted Medal Lecture 2001: `Physics Education Research - The Key to Student Learning'," Am. J. Phys. 69(11), 1127-1137 (2001)


Research on the learning and teaching of physics is essential for cumulative improvement in physics instruction. Pursuing this goal through systematic research is efficient and greatly increases the likelihood that innovations will be effective beyond a particular instructor or institutional setting. The perspective taken is that teaching is a science as well as an art. Research conducted by physicists who are actively engaged in teaching can be the key to setting high (yet realistic) standards, to helping students meet expectations, and to assessing the extent to which real learning takes place.


Investigations of student understanding of one-dimensional kinematics started in 1973 and led to two published papers in 1980 and 1981. These papers were the first physics education research (PER) based papers to appear in the American Journal of Physics. PER emphasizes student understanding of scientific content, not educational theory or methodology in the general sense. For both intellectual and practical reasons, discipline-based education research should be conducted by science faculty within science departments. Millikan characterized science as "a body of factual knowledge accepted by all workers in the field." Richtmyer went on to note that, thus, "one must admit that in no sense can teaching be considered a science." McDermott is building a body of the first and seeks to refute the second. The Physics Education Group (PEG) treats research on the learning and teaching of physics as an empirical applied science. The results of their replicable, systematic, reported investigations support the premise that teaching can be considered a science.

Two basic results are that student knowledge is practically instructor-independent in "equivalent physics courses" and that there are a limited number of conceptual and reasoning difficulties among students. These difficulties can be identified, analyzed, and addressed through an iterative process of research, curriculum development, and instruction. To the extent that student difficulties and effective strategies are generalizable and the results reproducible, they form a "reasonable foundation of accepted fact." Publicly shared knowledge that provides a basis for the acquisition of new knowledge is characteristic of science. To the extent that faculty are willing to draw upon and contribute to this publicly shared knowledge, teaching can be treated as a science. The PEG's primary criterion for effectiveness of instruction is the assessment of student learning in terms of specified intellectual outcomes. If student learning (as distinct from enthusiasm) is the criterion, then the PEG has found that effective teaching is not tightly linked to the motivational effect of the lecturer, to student evaluations of the course, to student evaluations of the instructor, nor to self-assessment of learning by students. The PEG focuses on the student-as-learner, not the instructor-as-teacher. They study various populations, from K-12 teachers of physics and physical science up through physics graduate students. Their two primary research methods are individual demonstration interviews that probe deeply, and widely administered written tests that detail prevalence. An important part of the process is a comparison of student performance on corresponding pre and post-tests. The results are used to guide curriculum development [as an example, see paper 37]. Day-to-day interaction in nonstandard small classes has provided the opportunity to observe the intellectual struggles of students, to detail specific difficulties, to experiment with different instructional strategies, and to monitor their effect on student learning.

One of the PEG's conceptual questions asks a student to rank the brightness of the bulbs in Figure A, given that the batteries and bulbs are ideal and the bulbs identical.

This question reveals two widespread mistaken beliefs: first, that the battery is a constant current source, and second, that current is "used up" in the circuit. Only 15% of over 1000 students gave the correct ranking on both the pre and post-tests. Students do much better on the post-test if they are taught via "Physics by Inquiry". In "Physics by Inquiry", a student must go step-by-step through the reasoning needed both to overcome conceptual hurdles and to build a consistent, coherent framework. "Physics by Inquiry" emphasizes collaborative learning and peer instruction. It also stresses explanations of reasoning. The instructor does not lecture but poses questions that help students arrive at their own answers.
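
Since Figure A is not reproduced here, only the underlying physics can be sketched. The short Python calculation below assumes a commonly cited version of the layout (one bulb alone, two bulbs in series, and two bulbs in parallel, each arrangement across an identical ideal battery); the layout and the unit values are my assumptions, not McDermott's figure.

    # Brightness ~ power dissipated in a bulb, P = I**2 * R.
    V = 1.0  # ideal battery voltage (arbitrary units; assumed)
    R = 1.0  # resistance of each identical bulb (assumed)

    single   = (V / R) ** 2 * R        # bulb alone: I = V/R
    series   = (V / (2 * R)) ** 2 * R  # two bulbs in series share V: I = V/(2R)
    parallel = (V / R) ** 2 * R        # each parallel bulb sees the full V

    print(single, series, parallel)    # 1.0 0.25 1.0

The numbers contradict both mistaken beliefs: the battery delivers V/R, V/(2R), or 2V/R depending on the load, so it is not a constant current source; and the two series bulbs carry the same current and are equally bright, so current is not "used up".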

Few students develop a functional understanding of physics, despite faculty giving lucid explanations, performing demonstrations, and illustrating problem-solving procedures. A functional understanding is the ability to interpret and use physics in situations different from those in which it was acquired. Among the many problems are (1) professors viewing students as younger versions of themselves, and (2) a type of problem solving that reinforces the detrimental perception that physics is a collection of facts and formulas, with the key to solving physics problems being the search for the "right" equation. Typical introductory physics classes are large, rapidly paced, and cover a wide breadth of material, all of which preclude the use of "Physics by Inquiry". "Tutorials in Introductory Physics" is an attempt to use some of the important features of "Physics by Inquiry" to supplement the typical course.

McDermott presents six "research-based generalizations on student learning" matched to six "research-based generalizations on teaching." The six student learning generalizations are illustrated via examples from optics, notably single-slit diffraction and double-slit interference. The six teaching generalizations are each expanded into several paragraphs of specifics. After listing the six and six, I will flesh out a couple to provide some feel for her presentation. The following are the six matched pairs. (1) Facility in solving standard quantitative problems is not an adequate criterion for functional understanding. Questions that require qualitative reasoning and verbal explanations are essential for assessing student learning and are an effective strategy for helping students learn. (2) Connections among concepts, formal representations, and the real world are often lacking after traditional instruction. Students need repeated practice in interpreting physics formalism and relating it to the real world. (3) Certain conceptual difficulties are not overcome by traditional instruction, and advanced study may not increase understanding of basic concepts. Persistent conceptual difficulties must be explicitly addressed in multiple contexts. (4) A coherent conceptual framework is not typically an outcome of traditional instruction. Students need to participate in the process of constructing qualitative models and applying these models to predict and explain real-world phenomena. (5) Growth in reasoning ability often does not result from traditional instruction. Scientific reasoning skills must be expressly cultivated. And (6) teaching-by-telling is an ineffective mode of instruction for most students. Students must be intellectually active to develop a functional understanding.

McDermott fleshes out the idea that facility in solving standard quantitative problems is not an adequate criterion for functional understanding by using an example from optics. Two problems (Fig. B1 and Fig. B2), which are essentially the same, were given to 130 students. For the quantitative problem (Fig. B1), 85% of students said there was a minimum, and 70% calculated θ ≈ 14°, the correct angle. For the qualitative problem (Fig. B2), 45% got the correct comparison of a > λ, but only 10% gave a correct explanation. Since minima are visible, the angle to the first minimum is less than 90°; also a·sinθ = λ; therefore, since sinθ < 1, a > λ.
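
The quoted answer can be checked with a few lines of arithmetic. The Python sketch below uses hypothetical values for the wavelength and slit width (the actual numbers from Fig. B1 are not reproduced in this thesis), chosen so the first minimum lands near the 14° quoted above.

    import math

    wavelength = 500e-9   # light wavelength in meters (assumed)
    slit_width = 2.07e-6  # slit width a in meters (assumed)

    # A first minimum exists only if sin(theta) = lambda/a <= 1, i.e. a >= lambda;
    # this is the qualitative comparison (a > lambda) students had to explain.
    assert wavelength / slit_width < 1

    # First single-slit diffraction minimum: a * sin(theta) = lambda.
    theta = math.degrees(math.asin(wavelength / slit_width))
    print(f"first minimum at {theta:.1f} degrees")  # prints ~14.0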

McDermott fleshes out the idea that persistent conceptual difficulties must be explicitly addressed in multiple contexts; in the process, she advances her instructional strategy. Merely warning students about errors is ineffective. Avoiding situations likely to evoke errors simply conceals latent difficulties that will surface at some later time. She uses a strategy of "elicit, confront, and resolve" in multiple attempts that also necessitate opportunities to "apply, reflect, and generalize." She also takes issue with the term "misconception research," believing that it trivializes a problem that is a symptom of fundamental confusion.

In her last major section, she applies the earlier research-based generalizations to the development of curriculum. The curriculum is "Tutorials in Introductory Physics." The subsections are (a) description of the tutorials, (b) preparation of tutorial instructors, (c) supplementary instruction by guided inquiry: an example from physical optics, (d) assessment of student learning, and (e) effectiveness of the tutorials. The Tutorials are a guided inquiry experience. A tutorial instructor guides students through the necessary reasoning by posing questions. The students work in groups of three or four. The Tutorials consist of pretest, worksheet, homework, and post-test. They are not designed to transmit information, nor to build standard problem-solving skills. Rather, they construct concepts, develop reasoning skills, and relate the formalism of physics to the real world. Thus, the Tutorials develop functional understanding as a supplement to textbooks and lectures. The preparation of tutorial instructors is primarily through weekly seminars, which are conducted on the same material and in the same manner that the tutorial instructors are expected to teach. Assessment of student learning is primarily through pre and post-tests. A tutorial is judged successful if the students' post-test scores match or exceed the tutorial instructors' pretest scores from their weekly seminar. Examples of both pre and post-tests are provided. Tutorials are effective; they have a very positive impact on students' ability to do qualitative problems. Students also do somewhat better on quantitative problems and have a higher retention rate than under standard instruction.

McDermott comments that there is no need to reinvent the wheel; she urges instructors to use existing curriculum that has been thoroughly evaluated. She concludes: (1) a focus on qualitative understanding is setting a higher standard than that tacitly accepted in the past by traditional classes. (2) Research is the key to student learning. And (3) there is increasing acceptance within the physics community that PER is "an appropriate field for scholarly inquiry by faculty in physics departments" as resolved by the May 1999 Council of the American Physical Society.

The last sentence is a beautiful segue into my last chapter of Part III, Defense of PER. Before we get there, there are a couple of auxiliary papers to mention and a commentary to perform. The first auxiliary paper is McDermott's Millikan Lecture paper published in Am. J. Phys. 59(4). While it is similar to this one, there are some worthwhile variations to note. Instead of optics, her examples involve mechanics, particularly the Atwood's machine. She provides a very concise definition of the constructivist approach to teaching and speaks extensively on some of the dangers associated with computer usage. One such danger is that computers often avoid error rather than confront it; this results in the error persisting in non-computer environments. The other auxiliary paper is her Atwood's machine paper published in Am. J. Phys. 62(1). She found that at least half of all students do not understand the concept of tension. A major part of this failure is the general inability to distinguish tension from weight.

Her comments about student learning not correlating with student evaluations of the course, the instructor, or themselves are unusual, but they do have supporters in the literature (paper 11). This lack of correlation between learning and a student's self-evaluation of his learning environment does seemingly conflict with some MPEX data which suggest students engage in intellectual damage control and reduce their efforts if they don't have a positive evaluation of their environment (paper 27). Of course, effort is not learning, but one hopes for a strong correlation between the two. I'd like to know how she defines retention. No other paper documents a positive cascade effect or even improved long-term retention; in fact, several papers document the opposite (papers 13, 16, 28). The methodology employed by her instructors is very reminiscent of SDI (paper 12). The use of the tutorials as a supplement raises the old question from paper 17 (HPS): if it's needed and effective, why not use it in the first place? Reif, in paper 44, gave an example problem in which only 20 out of 79 tutorial students taught by McDermott's Physics Education Group answered correctly. Reif, I'd venture to guess, would not agree with McDermott's de-emphasizing of educational theory or methodology. Redish certainly would disagree, as his entire paper (30) is a call for a general theoretical framework and against "collecting data into a wizard's book of everything." McDermott starts her paper with some quotes, and while she is building a "reasonable foundation of accepted fact", there are more than a few physicists not included in the "all workers in the field." Paper 13 notes for us the practicalities behind the words "same" and "ideal" in setting up the light bulb question out in the real world. Although we're not told, such considerations (weak batteries, etc.) may be part of the instructor training she quite correctly insists on. This raises a point: why does she inform us that Tutorials are "instructor independent"? If nothing else, the lower bound would be adversely affected by poorly prepared instructors. Further, I find it sad that an exceptional teacher means nothing if the curriculum is good. She quite often uses "science instructor" instead of "physics instructor". This linguistic shift causes me to wonder whether there is a CER or even a BER to accompany PER. If there are such things as Chemistry Education Research or Biology Education Research, why are they ignored in physics education research papers; if they do not exist, why not? McDermott states that PER "increases the likelihood that innovations will be effective beyond a particular instructor or institution". Preserving knowledge and disseminating it to self-actuated recipients is not enough. Bower, in "Scientists and Science Education Reform: Myths, Methods, and Madness", available at http://www.nas.edu/riese/backg2a.htm, discusses a twelve-week elementary school science curriculum. One of his several points is that "there is little evidence" that training courses have much effect outside the classroom of the trained teacher. Real teachers seldom have the means, or the time, to support or transform the teaching techniques of their colleagues. Trained teachers are not enough. Long-term in-district support, money, and follow-up are necessary for science reform to succeed.
There are ten pages' worth of additional material in Bower's paper; it obviously ties in nicely with paper 18's advocacy of catching prospective physics students when they're young (a thinner layer of prejudice to penetrate) and with paper 46's advocacy of starting early because of brain development biology. The point is that single, isolated, trained, motivated teachers often, if not always, fail to change large bureaucracies. The Just Do It group of paper 29 only had to deal with six faculty and no engineering department; it still took them ten years to start on the second course of their Introduction to Physics curriculum! So while PER may be necessary, it is by no means sufficient to effect wide-scale change in how physics is taught. At press time, I just noticed that McDermott matches a student's post-test to an instructor's pretest. I had always assumed the instructors took the same test (the post-test); they just did so prior to any instruction provided at the weekly seminars. Thus, the pre-instruction table for the instructors and the post-instruction (post-test) table for the students would be referring to the same physical test, applied at different points relative to instruction. It seems to me now that this is not the case, and that two separate, similar but not identical tests are being compared in McDermott's definition of success. Thus, "success" becomes intertwined with just how similar the two separate tests are.



Chapter 31 : Defense of PER


31.1 : Paper 48, Who Needs Physics Education Research!?

PER has its detractors and its internal rivalries, most of which are only faintly reflected in the literature. The upcoming two papers and two letters to the editor offer a defense of PER and, by inference, some glimpses of the positions of PER's detractors. Few, if any, detractors see fit to invest the time to write and publish an anti-PER paper, although they may send in a letter to the editor or make some side comments in a speech focused on "real" physics. Astronomy was born out of Astrology, and Chemistry out of Alchemy. All science has "messy" beginnings; PER is merely young.


D. Hestenes, "Who needs physics education research!?," Am. J. Phys. 66(6), 465-467 (1998)


PER is a credible discipline with a body of reliable empirical evidence, clarified research issues, and able researchers. It is a serious program that applies to our teaching the same scientific standards we use in physics research. Unfortunately, most of our colleagues are oblivious, and some who aren't are contemptuous. David Griffiths' concerns about the FCI are important and shared by thoughtful physics teachers. It would be a poor sort of educational research that failed to address them. The database of the FCI is enormous, with more than 20,000 students and 300 physics classes spanning the range from high school to graduate school. This database is so broad, and involves so many other physics instructors and researchers, that the unsettling message it brings can no longer be attributed to bias or incompetence of the original investigators. Griffiths questions the validity of the FCI and the urgency of the results. This skepticism is based on general doubts about multiple-choice tests and "his own arm-chair analysis of test items." Anecdotal evidence is no more adequate in education than it is in science. The FCI has been carefully validated with extensive student interviews. The carefully constructed distracters for each item are not typical multiple-choice throwaways, but common sense alternatives to Newtonian concepts that amplify the significance of student responses.

Item 19 of the FCI has been dropped in a recent minor revision, with the ironic consequence of slightly lowering FCI scores. As acknowledged in the original article, item 19 was defective, but it sought to measure the superposition principle, which we felt was too important to ignore. Unfortunately, we have not devised a satisfactory replacement. FCI questions deliberately avoid the technical, precise, unambiguous language of physics. Too often students respond to the form of the technical language rather than its meaning. For example, in one survey 80% of students could state Newton's Third Law even though only 15% fully understood it as measured by the FCI. Validation interviews confirm that Newtonian thinkers are able to resolve the imprecision and ambiguities arising from the avoidance of technical language.

The reform movement has used FCI data as compelling evidence that there are serious problems with physics instruction. The FCI data are far from the only such evidence. There is a huge PER literature on student misconceptions which supports the same conclusion. Lillian McDermott has documented, by methods other than FCI use, the huge gap between what teachers think they are teaching and what students are actually learning. The FCI sets a minimal standard for effectiveness of instruction in Newtonian mechanics. It is a discrimination test that forces a student to choose between basic Newtonian concepts and naive alternatives. Griffiths wonders whether we should expect a student to meet this minimal standard in a first course, arguing that real understanding takes a long time to mature. "Real understanding" aside, even students with average FCI scores have learned more wrong and misleading material than enlightening material. To the extent that students have not mastered the material on the FCI, they will systematically misinterpret what they hear and read in a physics course. They will treat the technical language of physics as muddled jargon, and they will be forced to resort to rote methods of learning and problem solving. Practice makes permanent, and mindless plug-and-chug without concepts is counterproductive, not perfect. Griffiths is mistaken in believing reformers advocate "teaching to the test." Such an approach fails badly, as rote learning has a half-life of only a few days.

Griffiths rises to the defense of lectures, although "there is no evidence that students who attend lectures learn more than those who don't." In fact, the complex cognitive skills required to understand physics cannot be developed by listening to lectures any more than one can learn to play tennis by watching tennis matches. The tapes of Feynman's Lectures on Physics are the pinnacle of classroom performance. They were expressly prepared for first-year physics students at Cal Tech. Feynman himself regarded them as a failure, as only a small fraction of the students were really able to cope with the course. Enough has been observed about the deficiencies and dangers of lecture to shift the burden of proof to its proponents. Eric Mazur, Alan Van Heuvelen, and Richard Felder have risen to the challenge by modifying the lecture format to actively engage students. They have also assumed responsibility for systematically evaluating their effectiveness.

Griffiths believes "any pedagogical method requires a good teacher, and good teachers are extremely rare." While Hestenes concurs that the general state of physics teaching is amateurish, he believes good teaching is an acquirable skill, relying on his decade as a PI on an NSF teacher enhancement grant for in-servicing nearly 150 high school physics teachers. Hestenes sums up paper 48 with a preview of five conclusions from his grants. (1) Teachers with low FCI scores are unable to raise student scores above their own. (2) The best teachers love the challenge of learning something new and are eager to share the experience with students. (3) Managing the quality of classroom discourse is the single most important factor in teaching with interactive engagement methods. (4) Teachers create an environment in which students construct their own understanding. (5) Technical knowledge about teaching and learning is as essential as subject content knowledge.

Several things come to mind. This was a response to Griffiths' paper "Millikan Lecture 1997: Is there a text in this class?" published in Am. J. Phys. 65. Hestenes is one of the authors of the MD (1985), the MB (1992), and the FCI (1992). Hestenes' claim that the FCI has been carefully validated stands in contrast to his 1992 statement in paper 1 that "formal procedures" to validate the FCI are "unnecessary" because of its similarity to the validated MD. The original FCI paper (paper 1) does note the aforementioned weakness of question 19, if in slightly different words; questions 19 and 29 are jointly labeled "weak discriminators" that "could be dropped from the test". That between 1992 and 1998 there was no satisfactory replacement for question 19 is a poor sort of educational research indeed. If language were a large issue, one sign would be the formally worded MB post-instruction scores bettering the less precisely worded FCI post-instruction scores; this is not the case. Hestenes is obviously anti-lecture; his tennis-match analogy, accurate or inaccurate, is beautiful. His nod to the work of Eric Mazur et al. is a nice balance. His lamenting the amateurish state of teaching reminds me forcibly of paper 31, in which Hammer likens the teaching concepts of most professors to the naive beliefs of most students: strongly held, based on a lifetime of experience, and wrong. Were this the last paper, I could lay claim to coming full circle back to the FCI (paper 1), but there are a few more defenders of PER to attend to.


31.2 : Paper 49, How Do We Know If We Are Doing A Good Job?


R. Ehrlich, "How do we know if we are doing a good job in physics teaching?" Am. J. Phys. 70(1), 24-29 (2002)


Paper 49 is based on a talk given at the AAPT summer meeting at which Ehrlich received the 2001 AAPT Award for Excellence in Undergraduate Teaching. The reactions of the audience at the talk were polarized, with some viewing the talk as an attack on the physics education reform movement. The author did not intend it as such. While Ehrlich mentions such evaluative tools as the FCI, the MPEX, retention rates, and student evaluations, he does so in a qualitative fashion and mainly as foils for presenting his beliefs and cautions. My summation of the content and mood of this article is this: however much reformers fear that the bath water is drowning the kid, traditionalists fear at least as much that in throwing out the bath water, reformists have also managed to toss out the kid.

Paper 49 starts with an anecdote on student evaluations; it quotes them to illustrate, in the end, a chaotic system in which small initial perturbations can lead to widely divergent results. I personally have an interest in this anecdote, as so little teaching, traditional or reform, uses the old recitation method employed by the author. Ehrlich faithfully reports that this method has its dangers. The author is uneasy and points to a few sources of his discomfort. He dislikes the use of FCI results to justify the death of lectures. He points out some weaknesses of the FCI and some validating data for lectures. He particularly takes issue with lectures being labeled as non-interactive, pointing to the work of Mazur, Redish, and Bligh as counterexamples. Ehrlich uses the MPEX data by Redish to advance his curiosity as to whether the decrease in student attitude over the lifetime of a course would hold true for subsets of students based on their course grade. While explicitly stating that "success in physics knows no gender or racial boundaries," he does hold to the old conventional wisdom that there are "intrinsic" differences in students. These intrinsic differences, independent of type of instruction, enable some students to succeed with little effort, while others fail even after considerable effort. From here, he springboards into disagreeing with the idea that a decent grade is the reward for a conscientious student's time and effort. Ehrlich argues that failing a class can teach a great deal and be a long-term good, even if a short-term pain. Failing or passing people is the result of assessment, and the author speaks eloquently against "lowering the bar" for majors while advocating lower standards in conceptual physics courses, which promote greater science literacy among the general population. He also speaks to what it means to do a good job teaching; his bullets are: (1) you and your students respect each other, (2) start the semester with enthusiasm, (3) evaluate and try new teaching methods, (4) keep up with advances in physics, (5) encourage deep understanding, (6) maintain high grading standards, and (7) take student evaluations seriously - but not too seriously. After addressing the individual teacher, he addresses success at the departmental level. In summary, your department is doing a good job if it: (a) maintains high academic standards, (b) encourages and welcomes all students, (c) adds new options in the major as needed, (d) tries different forms of instruction, (e) offers students research opportunities, (f) listens to its students, and (g) graduates some majors.

While not a rousing direct attack on or defense of PER, paper 49 does advance opinions on PER subjects such as the lecture. I just don't see the paper matching the implied promise of the first paragraph; still, it fits nowhere else better. I agree with the author's comments on failing a student. However, in this day of inflated grades, the social costs to the teacher for any large-scale failing of students are high, bordering on prohibitive. The key to failing is the assessment tool, which in turn is keyed to what is valued. To not reward effort, independent of accomplishment, is to ensure that some students give up. High dropout rates are not accepted as a social goal in this day of the democratic ideal. That some pigs are more equal (or "intrinsic") than others is true; it's also political dynamite. Too many people were excluded for nonphysics reasons in the past. For those who pay the bill, allowing us to discriminate for any reason, justifiable or not, is anathema. Whether a prephysics course (paper 3) or a lowered-standards conceptual physics course, there is more than an echo of academic elitism that is not sustainable in the face of majority distaste. Nobody wants to be failed, and nobody wants to be put in with the dummies; these attitudes are legacies of our primary and secondary school systems where, to put it mildly, failure is not seen as a long-term good. Whether mass physics literacy would be achieved by physics equivalents to Art Appreciation or Art History is open to debate; such courses would be the equivalent of watching the tennis match, not playing on the court. Still, a physics-magic show might be worthy of consideration, particularly if its competition in a real student's life is either no physics course at all, or sitting through something disliked and soon forgotten. Art benefits from the involvement of non-artists (viewers, patrons, etc.); physics should also benefit from the involvement of non-physicists. Many non-astronomers like the pictures of planets and stars. Why should our goal of scientific literacy require more than physics appreciation or physics history? There is a general art appreciation and knowledge, the creation of which did not, at the college level, require the average person to actually do art. Watching a tennis match can build a literacy about tennis, even if it won't make you a tennis player. Instead of a watered-down majors curriculum, most students should take courses geared toward making physics fun, memorable, and interesting; not complete, integrated, and mind-numbingly overwhelming. My sister leads a very pleasant life with little to no scientific knowledge. A few neat physics tricks with simple (incomplete, even wrong) explanations to amuse her kids might interest her and them. We live in a world of specialization; people do not have to know how to design a car to drive it. In fact, many car enthusiasts don't even want to know the boring details; speed and exterior shape are quite sufficient motivators. What level of scientific literacy are we seeking? Given the student focus of PER, perhaps physics departments should offer a wider range of appreciation and history classes and see if anybody not already in the physics track shows up to say 'hi, this looks fun'. It was the counter-intuitive, interesting fun that brought me back to physics after a ten-year hiatus as an electrical engineer, homeless person, and math teacher.
In the final analysis, even when one is aware of physics' immense contribution to civilization itself, it's the fun that makes it worth the bother.


31.3 : Paper 50, Impressions and Importance of PER


S. Lamoreaux, "Impressions of Physics Education," Am. J. Phys. 69(6), 633 (2001); reviewed together with: J. Stith, D. Campbell, P. Laws, E. Mazur, W. Buck, D. Kirk, "Importance of Physics Education Research," Am. J. Phys. 70(1), 11 (2002)


Two letters to the editor, the second a public response to the first. This combination will probably remain unique, as the pair stimulated AJP to change its policy on letters to the editor. In the first, Lamoreaux argues that while Tutorials might help the average student, they harm the best students. He offers an example of one such student. He goes on to argue that PER should make funds available for public, independent assessment, and he ends with a warning that "better" is not the same as "good". The authors of the second letter definitely outnumber the original offender. They are the National Visiting Committee for the University of Washington's Physics Education Group. The National Science Foundation has charged them with oversight of the PEG's NSF grant. I suspect you're getting the idea; a better example of bureaucratic defense has not been published in the PER literature. If nothing else, these letters show why PER must be defended from both its enemies and its friends.



Chapter 32 : Conclusion to a Thesis


So, if you've read the whole thing, congrats; you're probably the third person to have done so. :-) Feel free to add your John Hancock to any others at the bottom of this page. Rather than repeat my previous conclusions, let me end with a sentence from the very first paragraph:


If you have knowledge to advance the art of teaching, share it.



Bibliography




Assimakopoulos, P., “A Computer-Aided Introductory Course in Electricity and Magnetism,” Computing in Science and Engineering Nov/Dec 2000, 88-94 (2000)


Bagno, E., B. Eylon, U. Ganiel, “From fragmented knowledge to a knowledge structure: Linking the domains of mechanics and electromagnetism,” PER Am. J. Phys. Suppl. 68(7), S16-S26 (2000)


Bao, L., and E. Redish, “Concentration analysis: A quantitative assessment of student states,” PER Am. J. Phys. Suppl. 69(7), S45-S53 (2001)


Beichner, R., L. Bernold, E. Burniston, P. Dail, R. Felder, J. Gastineau, M. Gjertsen, and J. Risley, “Case study of the physics component of an integrated curriculum,” PER Am. J. Phys. Suppl. 67(7), S16-S24 (1999)


Bonham, S., R. Beichner, and D. Deardorff, “Online Homework: Does it Make a Difference?,” The Physics Teacher 39, 293-296 (2001)


Bower, J., “Scientists and Science Education Reform: Myths, Methods, and Madness,” http://www.nas.edu/riese/backg2a.htm, 10 pages


Christian, W., “Educational Software and the Sisyphus Effect,” Computing in Science and Engineering May-June 1999, 13-15 (1999)


Colin, P., and L. Viennot, “Using two models in optics: students’ difficulties and suggestions for teaching,” PER Am. J. Phys. Suppl. 69(7), S36-S44 (2001)


Crouch, C., and E. Mazur, “Peer Instruction: Ten years of experience and results,” Am. J. Phys. 69(9), 970-977 (2001)


Cummings, K., J. Marx, R. Thornton, D. Kuhl, “Evaluating innovations in studio physics,” PER Am. J. Phys. Suppl. 67(7), S38-S44 (1999)


Ehrlich, R., “How do we know if we are doing a good job in physics teaching?,” Am. J. Phys. 70(1), 24-29 (2002)


Elby, A., “Another reason that physics students learn by rote,” PER Am. J. Phys. Suppl. 67(7), S52-S57 (1999)


Elby, A., “Helping physics students learn how to learn,” PER Am. J. Phys. Suppl. 69(7), S54-S64 (2001)


Galili, I., and A. Hazan, “The Influence of an historically oriented course on students’ content knowledge in optics evaluated by means of facets-schemes analysis,” PER Am. J. Phys. Suppl. 68(7), S3-S15 (2000)


Goodstein, D., “Now Boarding the Flight from Physics, David Goodstein’s Acceptance Speech for the 1999 Oersted Medal presented by the American Association of Physics Teachers, 11 January 1999,” Am. J. Phys. 67(3), 183-186 (1999)


Güémez, J., C. Fiolhais, and M. Fiolhais, “Revisiting Black's Experiments on the Latent Heat of Water,” The Physics Teacher Vol. 40, 26-31 (2002)


Hake, R., “Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses,” Am. J. Phys. 66(1), 64-74 (1998)


Hake, R., “Socratic Pedagogy in the Introductory Physics Laboratory,” The Physics Teacher Vol. 30, 546-552 (1992)


Halloun, I., and D. Hestenes, “The initial knowledge state of college physics students,” Am. J. Phys. 53(11), 1043-1055 (1985)


Hammer, D., “More than misconceptions: Multiple perspectives on student knowledge and reasoning, and an appropriate role for education research,” Am. J. Phys. 64(10), 1316-1325 (1996)


Hammer, D., “Student resources for Learning,” PER Am. J. Phys. Suppl. 68(7), S52-S59 (2000)


Harrington, R., “Discovering the reasoning behind the words: An example from electrostatics,” PER Am. J. Phys. Suppl. 67(7), S58-S59 (1999)


Henry, D., “Resource Letter: TE-1: Teaching electronics,” Am. J. Phys. 70(1), 14-23 (2002)


Hestenes, D., M. Wells, and G. Swackhamer, “Force Concept Inventory,” The Physics Teacher Vol. 30, 141-157 (1992)


Hestenes, D., and M. Wells, “A Mechanics Baseline Test,” The Physics Teacher Vol. 30, 159-166 (1992)


Hestenes, D., “Who needs physics education research!?,” Am. J. Phys. 66(6), 465-467 (1998)


Heuvelen, A., “Millikan Lecture 1999: The Workplace, Student Minds, and Physics Learning Systems,” Am. J. Phys. 69(11), 1139-1146 (2001)


Heuvelen, A., and D. Maloney, “Playing Physics Jeopardy,” Am. J. Phys. 67(3), 252-256 (1999)


Hobson, A., “Science Literacy and Departmental Priorities,” Am. J. Phys. 67(3), 177 (1999)


Holbrow, C., J. Amato, E. Galvez, and J. Lloyd, “Modernizing introductory physics,” Am. J. Phys. 63, 1078-1090 (1995)


Johnson, M., “Facilitating high quality student practice in introductory physics,” PER Am. J. Phys. Suppl. 69(7), S2-S11 (2001)


Kalman, C., S. Morris, C. Cottin, R. Gordon, “Promoting conceptual change using collaborative groups in quantitative gateway courses,” PER Am. J. Phys. Suppl. 67(7), S45-S51 (1999)


Kirkpatrick, L., “American Association of Physics Teachers 2001 Oersted Medalist: Lillian C. McDermott,” Am. J. Phys. 69(11), 1126 (2001)


Laws, P., P. Rosborough, and F. Poodry, “Women's responses to an activity-based introductory physics program,” PER Am. J. Phys. Suppl. 67(7), S32-S37 (1999)


Lindenfeld, P., “Format and content in introductory physics,” Am. J. Phys. 70(1), 12-13 (2002)


LiPreste, M., “A Comment on Teaching Modern Physics,” The Physics Teacher Vol. 39, 262 (2001)


Loverude, M., C. Kautz, and P. Heron, “Student understanding of the first law of thermodynamics: Relating work to the adiabatic compression of an ideal gas,” Am. J. Phys. 70(2), 137-148 (2002)


Maloney, D., T. O’Kuma, C. Hieggelke, A. Heuvelen, “Surveying students' conceptual knowledge of electricity and magnetism,” PER Am. J. Phys. Suppl. 69(7), S12-S23 (2001)


Marshal, J., and J. Dorward, “Inquiring experiences as a lecture supplement for preservice elementary teachers and general education students,” PER Am. J. Phys. Suppl. 68(7), S27-S37 (2000)


McCullough, L., “Women in Physics: A Review,” The Physics Teacher Vol. 40, 86-114 (2002)


McDermott, L., "Millikan Lecture 1990: What we teach and what is learned -- Closing the gap," Am. J. Phys. 59(4), 301-315 (1991)


McDermott, L., “Oersted Medal Lecture 2001: Physics Education Research – The Key to Student Learning,” Am. J. Phys. 69(11), 1127-1137 (2001)


Poulis, J., C. Massen, E. Rubens, and M. Gilbert, “Physics lecturing with audience paced feedback,” Am. J. Phys. 66(5), 439-441 (1998)


Redish, E., “The Implications of Cognitive Studies for Teaching Physics,” Am. J. Phys. 62(6), 796-803 (1994)


Redish, E., J. Saul and R. Steinberg, “Student expectations in introductory physics,” Am. J. Phys. 66(3), 212-224 (1998)


Reif, F., “Millikan Lecture 1994: Understanding and teaching important scientific thought processes,” Am. J. Phys. 63(1), 17-32 (1995)


Scherr, R., P. Shaffer, and S. Vokos, “Student understanding of time in special relativity: Simultaneity and reference frames,” PER Am. J. Phys. Suppl. 69(7), S24-S35 (2001)


Schneider, M., “Encouragement of Women Physics Majors at Grinnell College: A Case Study,” The Physics Teacher 39, 280-282 (2001)


Sokoloff, D., and R. Thornton, “Using Interactive Lecture Demonstrations to Create an Active Learning Environment,” The Physics Teacher Vol. 35, 340-347 (1997)


Stannard, R., “Communicating physics through story,” Physics Education, 30-34 (2001)


Steinberg, R., “Computers in teaching science: To simulate or not to simulate?,” PER Am. J. Phys. Suppl. 68(7), S37-S41 (2000)


Steinberg, R., and K. Donnelly, “PER-Based Reform at a Multicultural Institution,” The Physics Teacher Vol. 40, 108-114 (2002)


Styer, D., “The Word ‘Force’,” Am. J. Phys. 69(6), 631-632 (2001)


Thacker, B., U. Ganiel and D. Boys, “Macroscopic phenomena and microscopic processes: Student understanding of transients in direct current electric circuits,” PER Am. J. Phys. Suppl. 67(7), S25-S31 (1999)


Thornton, R., and D. Sokoloff, “Assessing student learning of Newton’s laws: The Force and Motion Conceptual Evaluation and The Evaluation of Active Learning Laboratory and Lecture Curricula,” Am. J. Phys. 66(4), 338-352 (1998)


Usher, T., and P. Dixon, “Physics goes practical,” Am. J. Phys. 70(1), 30-36 (2002)


Vokos, S., P. Shaffer, B. Ambrose, L. McDermott, “Student understanding of the wave nature of matter: Diffraction and interference of particles,” PER Am. J. Phys. Suppl. 68(7), S42-S51 (2000)


Wosilait, K., P. Heron, P. Shaffer, L. McDermott, “Addressing student difficulties in applying a wave model to the interference and diffraction of light,” PER Am. J. Phys. Suppl. 67(7), S5-S15 (1999)


Yeo, S., and M. Zadnik, “Introductory Thermal Concept Evaluation: Assessing Students’ Understanding,” The Physics Teacher Vol. 39, 496-504 (2001)
