# My SBG Journey: Mastery Level Calculation Methods

I have written previously about my interest in using the Power Law as a grading method in a Standards Based Grading (SBG) class. I am still quite interested, but I have come across a few other common SBG grading practices, and this post will be my way of working through those choices. PowerSchool uses the term Mastery Level Calculation Methods, and while I may not use that Learning Management System (LMS), the term encapsulates the concept so well that I will use it to refer to SBG grading methods from here on out. As a general understanding, most SBG uses a 4-point rubric to indicate mastery of individual skills and content, typically buttressed with oral or written feedback. In a perfect system the rubric/feedback combination would be sufficient and no "grades" would be needed; however, we live in a world that necessitates a grade value. In most of the schemes I have found, the individual standard rubric scores (as determined by the teacher's Mastery Level Calculation Method) are combined into a single overall score on the 4-point scale. This number is then translated into a grade that would be recognizable on a traditional scale (A-F or 100%, for example).

Here are the Mastery Level Calculation Methods I have so far come across (in no particular order):

• Highest Score
• Average of Highest 3 scores
• Median or Average
• Mode
• Decaying Average
• Average of the 3 Most Recent
• Power Law

For each method I would like to give a short description and then weigh the pros and cons. The following table (borrowed from the Albuquerque Public Schools Power Law explainer) will be used as the example student data for a particular skill category.

| Student | Assessment 1 | Assessment 2 | Assessment 3 | Assessment 4 |
|---------|--------------|--------------|--------------|--------------|
| #1      | 1            | 2            | 3            | 4            |
| #2      | 1            | 3            | 2            | 4            |
| #3      | 2            | 4            | 1            | 3            |
| #4      | 4            | 3            | 2            | 1            |

Highest Score – In this calculation method the student simply receives the highest score in each skill category regardless of when it was acquired.

Example – Using the data set each student would receive a “4” in this category because each student demonstrated a 4 in this skill category at one point in the grading period.
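If I were to sketch this rule in a few lines of Python (my own illustration, not anything from PowerSchool or the APS explainer), it would just be a call to max:

```python
# Highest Score: the mastery level is the best score ever earned,
# regardless of when it happened.
def highest_score(scores):
    return max(scores)

print(highest_score([1, 2, 3, 4]))  # Student #1 -> 4
print(highest_score([4, 3, 2, 1]))  # Student #4 -> 4
```

Both students land on a 4 even though their trajectories could not be more different.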

Pros – About the only pro I can see with this Mastery Level Calculation Method is the simplicity. There are no mathematical formulas or algorithms and it is easily understandable.

Cons – Where to start? The goal of SBG is to show where students are in the moment. With this scheme, a student that does well on the first assessment but poorly from there on out shows the same mastery as a student that is showing growth or even consistent skill. It would also likely be demotivating once a high-level mastery demonstration is achieved. And it does not show whether or not the student understands the skill, since it does not encourage repeated demonstration. My biggest fear with this scheme is the regurgitation mentality we see in traditional grading schemes: the student learns only enough to get by, spews it back on the assessment, and then forgets it.

Average of Highest 3 scores – This Mastery Level Calculation Method is similar to the Highest Score. In this calculation, the 3 highest skill category scores are averaged together.

Example – Each student in the data set has a 1, 2, 3, and 4, so each would receive an average of (4+3+2) / 3, which equals 3.
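Sketched in Python (again, my own illustration), the calculation sorts the scores, keeps the top three, and averages them:

```python
# Average of Highest 3: average only the three best scores on record.
def avg_top3(scores):
    top3 = sorted(scores)[-3:]  # the three highest values
    return sum(top3) / 3

print(avg_top3([1, 2, 3, 4]))  # (2+3+4)/3 -> 3.0
print(avg_top3([4, 3, 2, 1]))  # same scores, same result -> 3.0
```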

Pros – This does help eliminate some of the negatives associated with the Highest Score calculation. It adds more data points that track the number of peak performances, which gives more support to a grade narrative.

Cons – Like the Highest Score calculation, though, this structure does not explain skill mastery in the moment. Student #4 in the data set clearly has a withering skill but will receive the same grade as Student #1, who is showing growth in ability. One may argue that this is "fair," but the grade should be indicating a trend and current skill level. Also, Students #2 and #3 have not shown a consistent ability to demonstrate their skill, so the grade narrative needs to be more nuanced than this calculation allows.

Median or Average – In the Average calculation the total number of points is divided by the total number of entries to determine the average mastery level across all assignments. The Median instead takes the middle value of the sorted scores (or the average of the two middle values when the count is even).


Example – Student #1 received (in order) 1 + 2 + 3 + 4 = 10 total points. This is divided by 4 which equals 2.5. Student #4 received (in order) 4 + 3 + 2 + 1 = 10 total points divided by 4 which equals 2.5. Both students receive the same 2.5 overall average.
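A quick Python sketch (mine, not any gradebook's code) shows how both students collapse to the same number under either calculation:

```python
from statistics import mean, median

student1 = [1, 2, 3, 4]  # growing skill
student4 = [4, 3, 2, 1]  # fading skill

print(mean(student1), mean(student4))      # 2.5 2.5
print(median(student1), median(student4))  # 2.5 2.5
```

The growth direction is completely invisible in the result.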

Pros – This is the most like a traditional gradebook (see cons) and the students will be able to quickly and easily understand the calculation.

Cons – This is most like a traditional gradebook. The biggest problem with this setup is that the final number calculated will not adequately describe a student's mastery of content in the moment. A goal of standards based grading is to show student growth. In most scenarios students will likely be at lower levels in pre-assessments and through the formative assessment process, as they are failing and iterating on their skills at the beginning of a unit. A traditional gradebook punishes students with low grades during this process and makes it mathematically challenging to improve the average enough for the final number to show that mastery has occurred. I can't say this better than the PowerSchool Forum:

"Median" is a bit better than "Average," but they are closely related. Neither system would allow students to show their most recent level of mastery, and both punish students too much for early mistakes for my taste.

Mode – In imprecise mathematical terms “Mode” is the most frequent result in a given set of data points.

Example – In the above data set each student would present a difficulty for the "Mode" setting, since each has one of each score. According to my reading, this would lead most spreadsheets to return the first value, so Student #1 would score a "1" whereas Student #4 would score a "4." If we alter the data set in a small way we get different results. For example, let's say Student #1 gets a 3 on assignment 2; the chart would then read 1, 3, 3, 4, and the mode would be "3."
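For what it's worth, Python's statistics.mode behaves the same way as the spreadsheets I read about: on Python 3.8 and later, a tie goes to the first value encountered (older versions raised an error on ties instead).

```python
from statistics import mode

# On ties, Python 3.8+ returns the first mode encountered,
# mirroring the spreadsheet behavior described above.
print(mode([1, 2, 3, 4]))  # Student #1: four-way tie -> first value, 1
print(mode([4, 3, 2, 1]))  # Student #4: four-way tie -> first value, 4
print(mode([1, 3, 3, 4]))  # altered data set -> 3
```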

Pros – In theory this would return a score that accurately reflects the level of mastery most often demonstrated by the student over the course of time.

Cons – This does not factor in formative assessment and, similar to the Average/Median problem, it may not reflect a score in the moment. If we are simply looking for the most frequent score, a student that struggles early in a unit but has a large growth spurt towards the end may be penalized for the amount of time it took to learn rather than rewarded for having mastered the skill/content in the end. Also, with a small data set a student may be penalized for having a large number of low scores mixed with higher scores. For example, consider the data set 1, 1, 1, 1, 2, 2, 3, 4, 4. I would read this as showing tremendous growth with a trend towards mastery; however, the Mode mastery level calculation would suggest a level 1 understanding.

Decaying Average – In this Mastery Level Calculation Method a student's most recently assessed items are weighted more heavily than older items. The most common calculation seems to give the newest item 65% of the weight, decaying the running value of all older items to the remaining 35% each time a new item is added to the skill or content category.

Example

Student #1 in our data set has 1, 2, 3, 4. The "1" is the earliest assessment and the "4" is the latest. After the first assessment the student would have a 1 overall. After adding the second, the calculation would look like this: (1*.35)+(2*.65) = 1.65. If this were an Average it would have simply been (1+2)/2 = 1.5. When the next assessment is added the calculation becomes (1.65*.35)+(3*.65) = 2.5275. Had we gone with an average it would have been (1+2+3)/3 = 2. Add the final score in and we have (2.5275*.35)+(4*.65) = 3.484625. Without the decaying average it would be (1+2+3+4)/4 = 2.5.

Student #4 in our data set has 4, 3, 2, 1. After the first assessment the student would have a "4". After the second the calculation would be (4*.35)+(3*.65) = 3.35. A straight average would give the student (4+3)/2 = 3.5. After the third assessment this student would have (3.35*.35)+(2*.65) = 2.4725. An average would give the student (4+3+2)/3 = 3. The final assessment would show (2.4725*.35)+(1*.65) = 1.515375.
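The running calculation above is easy to express in Python. This sketch (my own, assuming the 65/35 split used in the examples) reproduces both students' numbers:

```python
def decaying_average(scores, weight_new=0.65):
    """Each new score gets 65% of the weight; the previous
    running value decays to the remaining 35%."""
    running = scores[0]
    for score in scores[1:]:
        running = running * (1 - weight_new) + score * weight_new
    return running

print(round(decaying_average([1, 2, 3, 4]), 6))  # Student #1 -> 3.484625
print(round(decaying_average([4, 3, 2, 1]), 6))  # Student #4 -> 1.515375
```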

Pros – In a straight Average system (like most traditional gradebooks) both students would have ended this unit with a 2.5 grade; however, the trend lines towards mastery are going in completely different directions. Student #1 is showing growth and in the moment is trending towards mastery of the skill, whereas Student #4 is trending away from mastery and is doing fairly poorly in this standard in the latest moment. The decaying average allows for a better understanding of the student's current mastery level.

The goal of a standards based system is to allow for formative feedback. Students shouldn't be penalized for not knowing information that has not been taught (as in a pre-assessment) or when they are still learning the information (as in an early quiz). Also, assessments are meant to be a tool to determine strengths and weaknesses to be addressed in the teaching process. A low score at the beginning of a unit is not only likely but to be expected! Penalizing a student for this would not be fair to the student. It might also encourage cheating or "creative collaboration," which would result in poor data, poor feedback, and ultimately poor instruction.

The decaying average calculation would allow early formative grades to wither in value and, in theory, allow more recent grades to show growth patterns. Unlike some of the earlier calculations, like Highest Score or Mode, it also tracks students whose skills have eroded.

Another pro is that while the calculation is somewhat complex it is still simple enough for a high school student to grasp. This may be an issue with the Power Law calculation we will see later.

Cons – Since the decaying average weights the most recent example so heavily, there is potential that a single bad assessment could overly harm a student's score. Let's go off our data set to examine a student that has the following – 2, 3, 4, 1. After the first 3 assessments the student would have a 3.5275 and would seem to be trending towards mastery, but after the last assessment the score precipitously drops to 1.884625. Maybe that student was up all night because mom and dad were fighting and bombed the assessment, or maybe they were upset because the cafeteria was having beef nuggets (yes, this is a thing). Should this one assessment be weighted so heavily? Even with a 3 on the next assessment the score would only rise to 2.60961875. The decaying average seems to allow for an outlier to have an outsized impact on the grade narrative.

Average of the 3 Most Recent – As the name implies this Mastery Level Calculation would mash up parts of the Average and Decaying Average Calculations. Rather than account for all of the assessments as in the Average or Median Models this calculation only looks at the final 3 inputs. This would “wither” the older scores out of the final calculation without the more complicated formula of the decaying average.

Example – Student #1 would have a score based on (2+3+4)/3 = 3. Student #2 would have (3+2+4)/3 = 3, Student #3 would have (4+1+3)/3 = 2.67, and Student #4 would have (3+2+1)/3 = 2.
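In Python terms (my sketch, not a real gradebook's code), this is just a slice and a mean:

```python
# Average of the 3 Most Recent: only the last three scores count.
def avg_recent3(scores):
    return sum(scores[-3:]) / 3

print(avg_recent3([1, 2, 3, 4]))  # Student #1: (2+3+4)/3 -> 3.0
print(avg_recent3([4, 3, 2, 1]))  # Student #4: (3+2+1)/3 -> 2.0
```

The two students finally get different scores, unlike under the plain Average.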

Pros – This is a much more simplified version of the Decaying Average principle and the students would likely understand it more quickly. It is similar to the Average of the Highest 3 but does give a better idea of how the student is doing in the moment.

Cons – Much like the decaying average, there does seem to be some pretty heavy statistical noise that can come from an outlier data point. Also, one bad assessment would stay with a student for at least 3 more assessments (it would move from most recent, to second most, to third most, and finally be out of the set altogether). This also feels like too much of a snapshot in the moment. Yes, this entire post has harped on determining a student's level of mastery in the moment, but the 3 latest assignments seems like a fairly arbitrary cut-off point.

Power Law Function – This function attempts to create a trendline or curve of the student’s grades based on the Power Law equation and predict the current Mastery Level based on all of the Data Points. This function attempts to remove much of the statistical noise created by a few outliers while also factoring in the body of the student’s work. I wrote about this some in my Power Law Possibility Post.

Example

Pros – I can’t say it much better than @science_goddess on her Excel for Educators blog,

“When new skills are introduced, there is a big gain in learning at the beginning. With repeated opportunities, learning still increases, but at a much slower rate.”

With this Mastery Level Calculation the earlier scores are factored in, but they are used not as inputs to an average but rather as plot points on a best-fit curve (did I just write that!?). This means that the in-the-moment snapshot of mastery will be more predictive of current mastery than a traditional gradebook average. It also means that the more data points are entered into the formula, the more accurate the curve becomes at predicting the mastery level. The math is definitely above my pay grade.
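Since the math is the interesting part, here is my attempt at a sketch. I do not know PowerSchool's actual implementation — everything below is my assumption — but a common way to fit a power law y = a·x^b is ordinary least squares on the logs of both axes, then reading the curve's value at the most recent assessment:

```python
import math

def power_law_estimate(scores):
    """Assumed sketch (not PowerSchool's actual formula): fit
    y = a * x^b by least squares on (ln x, ln y), then evaluate
    the curve at the latest assessment. Rubric scores must be
    positive (1-4 here) so the logs are defined."""
    n = len(scores)
    xs = [math.log(i + 1) for i in range(n)]  # assessment number
    ys = [math.log(s) for s in scores]        # rubric score
    mx = sum(xs) / n
    my = sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = math.exp(my - b * mx)
    return a * n ** b  # predicted mastery at assessment n

print(round(power_law_estimate([1, 2, 3, 4]), 3))  # growing student -> 4.0
print(round(power_law_estimate([4, 3, 2, 1]), 3))  # fading student -> about 1.3
```

Notice how the growing student's curve predicts full mastery while the fading student's prediction drops well below the straight average of 2.5: the curve follows the trend rather than the bulk of the history.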

The good news is that my district pays for PowerSchool which has this Mastery Level Calculation Method as an option and, if admin won’t let me use PowerSchool, there are also a number of free online gradebooks that support this as well.

I really like this mini explainer from PowerSchool. It shows three typical student types and how the graph and score would look for each.