Grant Wiggins on Grading

Grades
Provocations...
• What is a grade’s purpose? What follows for what it
should represent, regardless of tradition?
• Should we - would we - give grades if we didn’t have to?
• Is it acceptable for teachers at the same grade and
teaching the same course to view the same or similar
student work differently? What follows for grading
policy and practice?
• Why calculate the ‘mean’ when it penalizes progress and
over-rewards inconsistency?
• Why isn’t every public school giving grades against state
standards if those standards are obligatory?
• Why do parents like grades? Why is this interest sound,
even if teachers don’t like giving them?
Looking closely at habit
• Always tricky: habit runs so deep, we
rationalize it without realizing it
• We ask that you work hard to keep an
open mind and resist the “Yes, but...”
reaction that is inevitable
Essential Qs of Grading
• Why grade? Why not?
• Audience: For whom are we grading?
• Purpose: What is the primary purpose of a
grade? (What if our purposes conflict?)
• What should we grade? What shouldn’t we
grade?
• How should a grade be determined?
• How consistent should grades and grading
policy be (across time and across teachers)?
• How is quality control in grading best
achieved?
An indication of unthinking
and harmful habit
• I received the following email:
 “I was meeting with our high school Advanced
Placement Teachers, who were expressing
concerns about our open enrollment process
and the high failure rate. One math teacher
said that while a particular student was now
making grades in the 80's, she had made a 12
on an initial test, “so there is no way she's going
to make a passing grade for the first nine
weeks.”
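The arithmetic behind the teacher's claim can be sketched as follows. The email reports only the initial 12 and later grades "in the 80's"; the four later scores below are hypothetical values chosen for illustration. The sketch compares the mean with the median of the same scores.

```python
from statistics import mean, median

# Hypothetical scores: the email gives only an initial 12 and later
# grades "in the 80's"; the specific later values are assumptions.
scores = [12, 82, 85, 84, 88]

mean_score = mean(scores)      # pulled down by the single early 12
median_score = median(scores)  # reflects the student's typical level

print(f"mean   = {mean_score}")
print(f"median = {median_score}")
```

Under these assumed scores, one early result drags the mean well below every later score, while the median stays in the 80s.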
Other countries
• In Belgium, France, Morocco, Portugal, Peru,
Venezuela, Iran and Tunisia a 20-point
grading scale is used, in which 20 is the
highest grade and 0 is the lowest.
• The "passing" grade is usually 10
• Grades of 10-11 are "adequate".
• Grades of 12 or 13 are "passable" (better than
adequate)
• Grades of 14 to 15 are "good" (better than "passable")
• Grades of 16 to 17 are regarded as excellent and
outstanding, respectively. From this point on, you have
truly mastered the course.
• Grades of 18 to 19 are nearing perfection.
• Grades of 20 are just perfect.
Denmark
• The current Danish scale is called the 13-scale and consists of
10 grades ranging from 00 to 13, with 00 being the worst.
• 00: completely unacceptable performance.
• 03: very hesitant, very insufficient and unsatisfactory
performance.
• 5: hesitant and not satisfactory performance.
• 6: just acceptable performance.
• 7: mediocre performance, slightly below average.
• 8: average performance.
• 9: good performance, a little above average.
• 10: excellent but not particularly independent performance.
• 11: independent and excellent performance.
• 13: exceptionally independent and excellent performance.
Where is it written that
we must, as teachers,…
• Give only one (aggregate) grade to each
student in our class, even if the transcript
only permits one grade?
• Calculate grades using the mean score (as
opposed to, say, the median or the trend)?
• Grade each student in a diverse classroom
against the same standard?
• Keep different weighting and grading
variables constant all year?
• Assess and grade all students at the same
point in time (thereby making it impossible to
do a thorough and timely job of grading)?
Judging grading policies
and practices
• No progress without sound criteria:
 We need criteria by which we can
evaluate grading policies and practices in
a pedagogically wise and defensible
manner
 Only agreed-upon criteria can free us
from the tyranny of bad habits that we
try to rationalize
Criteria for any grading
system to meet:
• Honest feedback about one’s standing
• Fair to each student and other students
• Transparent and without mystery
• Credible to clients & constituencies
• Valid assessment against key long-term
learning goals
• Useful (actionable) and user-friendly
information about performance and how
to improve
• Pedagogically wise - it sends the right
message and gets the incentives right for
learners
Exercise: So, what
follows?
• In small groups, we’ll analyze some
practical implications for individual
and school-wide grading, given each
criterion:
 Fair
 Honest
 Transparent
 Credible
 Useful & User-Friendly
 Pedagogically wise
Symbols and their
meaning
• A big problem may not be “grades” but a
single grade, hiding vast differences in
performance, habits, and attitudes!
 Why do we only put single grades on report
cards and transcripts when disaggregated data
would greatly help the reader understand what
the learner has accomplished and who the
learner is as a learner?
Break out the Independent
Variables in Grading
• Validity, transparency, and usefulness
require knowing at least these 3 elements,
broken out:
 What is my level of achievement?
 What has been my progress (against
standards)?
 How are my work habits and attitudes?
Independent Variables in
Grading
• Most Grading Systems conceal more than
they reveal!
 Achievement (against standards)
 Sub-achievement (discrete competencies
making up overall achievement)
 Progress over time (against standards)
 Habits and attitudes (effort, open to
learning, etc.)
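One way to picture a report that keeps these variables separate is a small record type. The sketch below is only an illustration: the class name, field names, the 1-4 rubric scale, and the sample values are all assumptions, not part of any actual reporting system.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class SubjectReport:
    """One subject's report with the variables broken out (hypothetical layout)."""
    subject: str
    achievement: float  # level against standards, on an assumed 1-4 rubric scale
    progress: float     # change in rubric level since the previous report
    habits: str         # narrative on work habits and attitudes
    sub_achievement: Dict[str, float] = field(default_factory=dict)

    def weakest_competency(self) -> str:
        """Name the discrete competency with the lowest rubric level."""
        if not self.sub_achievement:
            raise ValueError("no sub-achievement scores recorded")
        return min(self.sub_achievement, key=self.sub_achievement.get)

# Hypothetical sample values.
report = SubjectReport(
    subject="Language Arts",
    achievement=3.0,
    progress=0.5,
    habits="Consistent effort; open to feedback",
    sub_achievement={"reading": 3.5, "writing": 2.5, "participation": 3.0},
)
print(report.weakest_competency())
```

Keeping the fields separate lets a reader ask about any one of them, which a single letter cannot support.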
Problems with the
(Single) Grade
• Different Work, Same Grade
 Average Achievement, Great Progress
vs. High Achievement, No progress
 Different sub-scores on multiple
rubrics, SAME AVERAGE
 Wild swings vs. consistency - SAME
AVERAGE
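The "same average" problem is easy to reproduce. The sketch below uses three hypothetical score sequences (in chronological order) that share one mean but differ in consistency and direction.

```python
from statistics import mean, pstdev

# Hypothetical score sequences, listed in chronological order.
students = {
    "steady": [80, 80, 80, 80],
    "swinging": [60, 100, 60, 100],
    "improving": [65, 75, 85, 95],
}

def profile(scores):
    """Return (mean, population standard deviation, last-minus-first change)."""
    return mean(scores), pstdev(scores), scores[-1] - scores[0]

profiles = {name: profile(s) for name, s in students.items()}
for name, (avg, spread, change) in profiles.items():
    print(f"{name:10s} mean={avg:.1f} spread={spread:.1f} change={change:+d}")
```

All three students would receive the same averaged grade, although one is consistent, one swings widely, and one has risen 30 points.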
Sub-grades vital in a
subject
• “Yes, I know I got a ‘B’ in Language
Arts, but break it down for me” by –
 Grades against each state standard
 Control over varied genres of reading
 Control over varied genres of writing
 Each criterion used in rubrics related to
my writing
 The quality of my participation
Some implications,
explored in this conference
• Thus, we are not against grades! We are against
dumb grading systems!
 Conventional summary grades,
calculated by computing the
arithmetical ‘mean’, are
indefensible (despite the
longstanding habit of doing so)
 Failure to give grades against state
standards is irresponsible
 Providing only a single grade is
unhelpful
Beyond the Mean: it
hides vital information
• The “average” (the mean) hides or under-rewards:
 Trend of work over time - progress
 Consistency of work quality
 Key feedback on independent variables
making up the grade
 Degree of true mastery against an
objective standard
Beyond the mean: Other
Methods of “Averaging”
• Beyond the mean:
 Reliable pattern (median, mode)
 Consistency (range, standard deviation)
 Throw out the highest and lowest scores
(Olympics)
 Factor in degree of difficulty of the tasks
(as in music, diving, gymnastics)
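The alternatives listed above can be computed side by side. This is a sketch under assumptions: the scores and difficulty weights are hypothetical, and treating "degree of difficulty" as a weight in a weighted mean is one possible reading, not a rule taken from any sport. The mode is omitted because it is uninformative when every score is distinct.

```python
from statistics import mean, median, pstdev

def trimmed_mean(scores):
    """Mean after dropping one highest and one lowest score."""
    if len(scores) < 3:
        raise ValueError("need at least three scores to trim")
    return mean(sorted(scores)[1:-1])

def difficulty_weighted(scores, difficulties):
    """Weighted mean in which harder tasks count for more."""
    total_weight = sum(difficulties)
    return sum(s * d for s, d in zip(scores, difficulties)) / total_weight

scores = [55, 78, 82, 85, 90]             # hypothetical scores
difficulties = [1.0, 1.0, 1.5, 2.0, 2.0]  # hypothetical task weights

summary = {
    "mean": mean(scores),
    "median": median(scores),
    "range": max(scores) - min(scores),
    "std_dev": pstdev(scores),
    "trimmed_mean": trimmed_mean(scores),
    "difficulty_weighted": difficulty_weighted(scores, difficulties),
}
for name, value in summary.items():
    print(f"{name:20s} {value:.1f}")
```

With these numbers each method gives a different summary of the same five scores, which is the point: the choice of method is a policy decision, not a mathematical necessity.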
Beyond the ‘average’:
Other Methods for
Determining Grades
• Think of diving, karate, chess:
 Total score over time (e.g. Olympics)
 Final mastery level (chess, karate,
ACTFL)
 Measure of progress against standards
over time (pre/post, the value-added)
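Each of these three methods reduces a score series to a different number. The sketch below assumes hypothetical scores in chronological order; the function names are illustrative, and the least-squares slope is added here as one simple way to express a trend.

```python
def total_score(scores):
    """Cumulative total across all performances."""
    return sum(scores)

def final_level(scores):
    """Most recent performance, treated as the level finally reached."""
    return scores[-1]

def gain(scores):
    """Pre/post difference: last score minus first score."""
    return scores[-1] - scores[0]

def trend_slope(scores):
    """Least-squares slope of score against assessment index (0, 1, 2, ...)."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores for a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(scores) / n
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(scores))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

scores = [40, 55, 70, 85]  # hypothetical, in chronological order

print(total_score(scores), final_level(scores), gain(scores), trend_slope(scores))
```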
Varying the weighting
and honoring difference
• Why penalize early effort or prior level?
 Don’t grade (or average in) all formative work
 INITIALLY grade Effort and Progress more than
level of achievement in the first month(s),
gradually making level of achievement more
significant
 Have LEVELS in your class, via pre-testing; make it
like skiing - Novice/Intermediate/Expert - where
that designation is given along with the grade - “A”
work for a Novice, but “C” work for an Expert
 Emphasize some rubrics more than others initially,
then give grades against all rubrics by the end
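A shifting weighting scheme of this kind might look like the sketch below. Everything specific in it is an assumption made for illustration: the 0-100 scales, the nine-week term, the linear schedule, and the starting and ending weights of 0.2 and 0.8 for achievement.

```python
def blended_grade(effort, progress, achievement, week, total_weeks):
    """Blend three 0-100 scores; the achievement weight rises linearly over the term."""
    if not 1 <= week <= total_weeks:
        raise ValueError("week must be between 1 and total_weeks")
    # Fraction of the term elapsed: 0.0 in the first week, 1.0 in the last.
    elapsed = (week - 1) / (total_weeks - 1) if total_weeks > 1 else 1.0
    # Hypothetical schedule: achievement counts 20% at first, 80% at the end.
    w_achievement = 0.2 + 0.6 * elapsed
    # Effort and progress split the remaining weight equally.
    w_other = (1.0 - w_achievement) / 2
    return w_achievement * achievement + w_other * effort + w_other * progress

# Hypothetical student: strong effort and progress, modest achievement so far.
early = blended_grade(effort=90, progress=85, achievement=60, week=1, total_weeks=9)
late = blended_grade(effort=90, progress=85, achievement=60, week=9, total_weeks=9)
print(f"week 1: {early:.1f}, week 9: {late:.1f}")
```

The same three scores produce a higher grade early in the term and a lower one at the end, so early effort is rewarded without masking the final level of achievement.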
Provide sub-grades for better
feedback
• A standards-based report card:
 Grade control over core genres in
Language Arts, state standards in math
and science
 Distinguish between mastery of content
vs. mastery of processes
Feedback: on which key
tasks?
• Key to reform: to view assessment in terms
of longer-term learning goals, not just
grades on quizzes of recent content;
• Feedback against –
 Key competencies/exit standards
 Key transfer tasks
 Key habits of mind
 Key long-term inquiries (essential questions)
Longitudinal Progress
via Rubrics
• Rubrics to track progress over time
against standards
 On a novice-expert continuum in addition
to rubrics for judging increasing
sophistication of ideas and processes
• cf. Rubrics in ACTFL, American Literacy
Profiles, Meisels’ Work Sampling System,
chess/karate/bridge etc.
Longitudinal Progress
via Rubrics
• Too Many Rubrics are Task-Specific
 A reporting and grading system should
align, for tracking progress over time
 Rubrics should be more generalized and
linked to exit-level standards
• Too many grading systems and rubrics
conflate descriptive LEVEL with
judgment about WORK QUALITY
(regardless of level)
ACTFL Example
• Novice-Low: Oral production consists of isolated words
and perhaps a few high-frequency phrases. Essentially
no functional communicative ability.
• Novice-Mid: Oral production continues to consist of
isolated words and learned phrases within very
predictable areas of need, although quantity is
increased. Vocabulary is sufficient only for handling
simple, elementary needs and expressing basic
courtesies. Utterances rarely consist of more than two or
three words and show frequent long pauses and
repetition of interlocutor's words…. Some Novice-Mid
speakers will be understood only with great difficulty.
ACTFL Example
• Novice-High: Able to satisfy partially the
requirements of basic communicative exchanges
by relying heavily on learned utterances but
occasionally expanding these through simple recombinations of their elements…. Shows signs of
spontaneity although this falls short of real
autonomy of expression…. Vocabulary centers on
areas such as basic objects, places, and most
common kinship terms. Pronunciation may still be
strongly influenced by first language. Errors are
frequent and, in spite of repetition, some Novice-High speakers will have difficulty being
understood even by sympathetic interlocutors.
ACTFL
• Advanced: Able to satisfy the requirements of
everyday situations and routine school and work
requirements…. Can narrate and describe with
some details, linking sentences together smoothly.
Can communicate facts and talk casually about
topics of current public and personal interest, using
general vocabulary. Shortcomings can often be
smoothed over by communicative strategies, such as
pause fillers…. Circumlocution which arises from
vocabulary or syntactic limitations very often is quite
successful, though some groping for words may still
be evident. The Advanced-level speaker can be
understood without difficulty by native
interlocutors.
UK natl curriculum:
science
• Level 4: Pupils recognize that scientific ideas are based on evidence. In
their own investigative work, they decide on an appropriate approach
(for example, using a fair test) to answer a question. Where appropriate,
they describe, or show in the way they perform their task, how to vary
one factor while keeping others the same. Where appropriate, they
make predictions. They select information from sources provided for
them. They select suitable equipment and make a series of observations
and measurements that are adequate for the task. They record their
observations, comparisons and measurements using tables and bar
charts. They begin to plot points to form simple graphs, and use these
graphs to point out and interpret patterns in their data. They begin to
relate their conclusions to these patterns and to scientific knowledge and
understanding, and to communicate them with appropriate scientific
language. They suggest improvements in their work, giving reasons.
From UK natl. curriculum
• Level 8: Pupils give examples of scientific explanations or models that
have had to be changed in the light of additional scientific evidence. They
evaluate and synthesize data from a range of sources. They recognize that
investigating different kinds of scientific questions requires different
strategies, and use scientific knowledge and understanding to select an
appropriate strategy in their own work. They decide which observations
are relevant in qualitative work and include suitable detail in their
records. They decide the level of precision needed in comparisons or
measurements, and collect data enabling them to test relationships
between variables. They identify and begin to explain anomalous
observations and measurements and allow for these when they draw
graphs. They use scientific knowledge and understanding to draw
conclusions from their evidence. They consider graphs and tables of
results critically. They communicate findings and arguments using
appropriate scientific language and conventions, showing awareness of a
range of views.
Reading Levels
Assessment
• Lexile Scores
• Degrees of Reading Power
• Meisels’ Work Sampling System
• American Literacy Profiles
• UK Reading rubrics
NO justification for fitting
grades to a bell curve
• True both mathematically and pedagogically
 Mathematics: the point of a curve is to show the
normal distribution of a vast number of
elements, likely to be ‘naturally’ distributed - e.g. the cholesterol numbers for an entire
population
 Education: We aim for performance
improvement across the board; a normal curve is
a sign of failure in a small class
• Note: so-called “curving” of a set of test grades is different, and
may reflect a wise adjustment based on the hunch that the raw
grades are not reliable
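The difference between grading against a standard and fitting a class to a curve can be shown with a small class. The sketch below is illustrative only: the cutoffs, the quota of letters, and the ten scores are hypothetical.

```python
def standards_grade(score, cutoffs=((90, "A"), (80, "B"), (70, "C"), (60, "D"))):
    """Letter grade against fixed cutoffs (hypothetical values)."""
    for floor, letter in cutoffs:
        if score >= floor:
            return letter
    return "F"

def curved_grades(scores, letters_by_rank):
    """Assign letters purely by rank so the class fits a fixed distribution."""
    if len(scores) != len(letters_by_rank):
        raise ValueError("need exactly one letter per student")
    # Student indices ordered from highest to lowest score; ties keep list order.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    grades = [None] * len(scores)
    for rank, student in enumerate(order):
        grades[student] = letters_by_rank[rank]
    return grades

# Hypothetical class of ten in which every student meets the top cutoff.
scores = [99, 97, 96, 95, 94, 93, 92, 92, 91, 90]
# Hypothetical bell-shaped quota: 1 A, 2 Bs, 4 Cs, 2 Ds, 1 F.
quota = ["A"] + ["B"] * 2 + ["C"] * 4 + ["D"] * 2 + ["F"]

by_standard = [standards_grade(s) for s in scores]
by_curve = curved_grades(scores, quota)
print(by_standard)
print(by_curve)
```

Against the standard, all ten students earn an A. Fitted to the curve, the student with 90 fails, and the two students tied at 92 receive different letters depending only on their position in the list.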
Bloom et al:
Grading on a Curve
• Grading on a curve - unjustified:
 “There is nothing sacred about the
normal curve. It is the distribution most
appropriate to chance and random
activity. Education is a purposeful
activity, and we seek to have the
students learn what we have to teach…”
Bloom et al:
Grading on a Curve
 “If we are effective in our instruction, the
distribution of achievement should be very
different from the normal curve. In fact,
we may insist that our efforts have been
unsuccessful to the extent that [grades]
approximate the normal distribution.”
• Evaluation to Improve Learning, p. 53
Why fitting to a curve is
wrong - and wrong-headed
• More generally:
 Fitting grades to a normal curve is to
exaggerate student differences and report
against norms not standards.
 We should seek the opposite: let the grades
fall where they may, against worthy
standards.
 Standards and expectations are lowered if
we fail to give disinterested grades against
clear criteria and models
Example of “Mike” a high-mobility student
Year    Location      Grade
1989    Salem         C+
1990    Vero Beach    C
1991    Bakersfield   C
1992    Albuquerque   A
1992    San Antonio   A+
1992#   Los Angeles   C
1993    Los Angeles   B+
# 1st year of mainstreaming
“Mike” (continued)
Year    Location      Score
1989    Salem         268
1990    Vero Beach    250
1991    Bakersfield   277
1992    Albuquerque   341
1992    San Antonio   371
1992#   Los Angeles   232
1993    Los Angeles   318
Grading Criteria,
explained
Key tension: Honest and
Fair
It is difficult but essential to be both; most
reports err on one side or the other
Honest: a dispassionate account of student
strengths and weaknesses against standards and
normed expectations
Fair: mindful of extenuating circumstances,
personal strengths and challenges, local norms,
and reasonable expectations of that individual
Fair
• Honors idiosyncrasies and appropriate
extenuating circumstances
 Don’t compare apples and oranges:
• Don’t confuse “behind” with “different
developmental path”
• Don’t emphasize narrow and arbitrary test types
• Be explicit about special-student status
Fair
• We work consciously and carefully to
eliminate bias and capriciousness
 Bloom et al.: “Unfortunately, extraneous
factors, such as the student’s speech,
sociability, personality can unconsciously
influence a teacher in grading”
Honest
• Faithfully describes absolute levels of
performance
 Honest reports do not flinch from stating the
level of performance against standards, and
making predictions about likely exit status
 Dishonest to report only the strengths - cf.
some narrative and portfolio systems
Credible
• Means that it is trustworthy and
believable information for ALL
constituencies - implying:
 disinterested scoring and grading on a
regular basis
 external validation in terms of tasks,
criteria, standards
To be credible a grade
(judgment) must be...
 Based on multiple & varied assessment, and
sensitive to multiple intelligences (MI) and learning styles
 Comparable grades/marks
 Given by someone who consistently monitors
based on instruction
 Measuring achievement toward standards
 Inter-rater reliability
 Measured over time
 More than one person has input
To be credible a grade
(judgment) must ...
 Be specific about what the child knows and is able to do
 Not confuse what a child does with what a child can do
(“potential” separated from “achievement”)
 Knowing the criteria and standards upon which the grade
was given
 Discrepant data not swept under the rug!
 Fair opportunities to learn
 Observable, measurable data is central to a judgment
Credibility: Quality
Control in Grading
• Grades are too often unreliable or
invalid (and inconsistent across
teachers)
 Agreement on “anchors” and grading
criteria needed
 Grading policies and oversight needed
 Need to link grades to external standards
Useful
• Refers to the reader’s ability to use
and profit from the report
 Links back to clarity about purpose(s) and
audience(s)
 Not likely to be useful if it fails to assess
and evaluate strengths and weaknesses
User-Friendly
• Refers to intelligibility and ease of
access of information
 Many reforms of grading and reporting
have not been user-friendly, though highly
valued by teachers
 Few parents understand why letter grades
are inadequate, and many are suspicious of
reforms
User-friendly (feasible)
• Must be manageable for teachers
 We need to rethink our use of time
to ensure rich, timely, and accurate
information
• Why do we unthinkingly report for ALL students
on the same day? Why not stagger report cards
by sections of the alphabet to lighten the load
and ensure better reports?
Toward Credible
Grading & Scoring
• Setting standards locally requires:
 on-going faculty group scoring of student
work
 system-wide rubrics and 'anchors' for
shared standards
 on-going study of performance results
leading to recommendations
 validating local grading standards
Credible
Valid, supported analysis, using a
standard approach
A trustworthy source, an expert
judgment
Believable, plausible, based on a
careful consideration
Achievement, defined
Performance against academic/
intellectual standards
Measure of “pure” performance level,
without extenuating factors or norms
Exemplars, criteria and specifications are
explicit and deliberately referenced
Progress, defined
Performance measured over time
against explicit standards
Requires a longitudinal assessment
system, on a novice-expert continuum
We need to ensure there is both pre-testing
and post-testing
Longitudinal rubrics
Purposes: Assess vs. Evaluate
• Assess: comment on and analyze work
against valid criteria - so as to give
value-neutral feedback & guidance
• Evaluate: summarize implications of
results, by assigning a value to the
overall performance, against some
standard(s) or expectation(s)
“Objective” &
“Subjective”
• Human judgments can and must be as
“objective” as possible!
 Objective: disinterested, valid, and reliable
judging against standards
 Subjective (in the bad sense): Biased or
capricious judgment
 Subjective (in the good sense): A human
judgment based on a variety of information, not
mechanically computed
• Beware of defining machine-grading as “objective” and
human scoring as “subjective”
“Objective” &
“Subjective”
• Thus, to say that local grades are “so
subjective” is misleading
 The issue is sound professional judgment
 We need more disinterested judging locally to
make scores and grades defensible - objective
• E.g. IB, Advanced Placement, group scoring of writing or
music against standards and rubrics, etc.
A Mantra:
• Give fewer grades and more feedback
 We tend to give too much evaluation, too
soon
 We tend to “grade” without helping the
student really grasp the feedback the
grade is based on
 Giving better feedback enables the
student to take greater control of giving
themselves (or seeking) advice, sooner
Excellent feedback
• The most common answers to our
exercise on good feedback:
 Timely
 User-friendly - in approach and amount
 Descriptive & specific re: performance
 Consistent
 Expert
 Accurate
 Honest, yet constructive
 Derived from concrete standards
 On-going
For further
information...
Contact me:
grant@authenticeducation.org
The grading and reporting bulletin
boards at authenticeducation.org
www.bigideas.org - our monthly resource
on a theme related to teaching and
assessing
The archive contains an issue devoted to
grading (winter 2005)