5.7 Scaled Scoring

Understanding Scaled Score in EdisonOS Practice Tests

Learn how scaled score works in EdisonOS, how it differs from raw score, and what you can configure on a per-test basis to align practice tests with official SAT, ACT, and custom score scales.

What is a scaled score?

A scaled score is a transformed version of a student's raw performance that lives on a standard, externally meaningful scale, such as the SAT (400–1600) or ACT (1–36). It adjusts for things like question difficulty, adaptive branching, and how different sections are weighted, so that scores from different practice tests stay comparable.

In EdisonOS, scaled score is optional and practice-test specific:

Only practice tests can enable scaled score.
Problem sets never use scaled score.
If a practice test does not have scaled score enabled, the report will simply show a raw score.

Raw score first, scaled score second

EdisonOS always grades every question and produces raw scores first. This includes per-question correctness, raw points per question, and raw totals across sections, modules, and the full test.

Raw grading is the source of truth. Scaled score is always a derived view sitting on top of the finalized report. This means:

Scaled score never replaces raw score.
If anything goes wrong with scaled-score calculation, the raw report still generates correctly.
Customers always have a complete raw-score view to fall back on.

How scaled score is structured

Scaled score in EdisonOS is computed in three layers:

Unit - A scoring bucket that gets its own scaled score. Examples include SAT Reading and Writing, SAT Math, or ACT English / Math / Reading / Science. You can also define custom units for bespoke tests.

Parts - The concrete sections or modules that feed into a unit. For example, the SAT Math unit may have parts for Module 1, Module 2 Easy, and Module 2 Hard. An ACT English unit may have multiple passages all rolling up into a single unit.

Total - An optional overall scaled score for the test, aggregated from units in one of two ways:

SUM (SAT-style): Keep native unit scales and sum them. For example, 200–800 + 200–800 = 400–1600.
AVERAGE (ACT-style): Average unit scores so the total stays on the same 1–36 scale.

Each unit can be either included or excluded from the total. Excluded units still get their own scaled score in the report; they just don't roll up into the composite. For example, ACT Science is excluded from the composite by default, but you can include it if a specific practice test calls for it.
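As a rough illustration of the two aggregation modes, the sketch below rolls unit scores into a total. The `UnitResult` type, the `total_score` function, and the round-half-up rule used for averages are assumptions made for this example, not EdisonOS's actual data model or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UnitResult:
    name: str
    scaled_score: int
    include_in_total: bool = True  # excluded units are still reported on their own


def total_score(units: List[UnitResult], mode: str, step: int = 1) -> Optional[int]:
    """Aggregate included unit scores with SUM (SAT-style) or AVERAGE (ACT-style)."""
    included = [u.scaled_score for u in units if u.include_in_total]
    if not included:
        return None  # nothing rolls up, so there is no total to show
    if mode == "SUM":
        return sum(included)
    if mode == "AVERAGE":
        mean = sum(included) / len(included)
        # Assumed rule: round half up to the nearest step.
        return int(mean / step + 0.5) * step
    raise ValueError(f"unknown aggregation mode: {mode}")


sat = [UnitResult("Reading and Writing", 640), UnitResult("Math", 700)]
act = [
    UnitResult("English", 25),
    UnitResult("Math", 23),
    UnitResult("Reading", 25),
    UnitResult("Science", 30, include_in_total=False),
]
print(total_score(sat, "SUM"))      # 1340
print(total_score(act, "AVERAGE"))  # 24 (73 / 3 = 24.33; Science is reported but not averaged)
```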

Field questions never affect scaled score

Field questions (questions marked as experimental or unscored) are explicitly excluded from both scaled score and difficulty-weighted sums used for scaling. They do not contribute to the numerator or denominator in any formula, regardless of which strategy your test uses.

SAT-style scoring (Weighted Mean)

SAT-style tests in EdisonOS use a weighted mean of question difficulty for each module. This means harder questions carry more weight, and a student's score reflects how much of the available difficulty they converted into correct answers.

Default ranges for SAT-style tests:

Reading and Writing: 200–800
Math: 200–800
Total (sum of both): 400–1600
Step size: typically 10 (so scores like 410, 420, 430…)

How it works for each module:

Each scored question carries a numeric difficulty score.
Correct, non-field questions contribute their difficulty score to a "scored weight."
Incorrect or skipped questions contribute zero.
The weighted mean for that module is scored weight ÷ total possible weight.
That weighted mean is multiplied by the module's configured maximum contribution to give its scaled contribution.
All module contributions are added to the unit minimum to produce the unit score.
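The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the `Question` type, the function names, and the sample difficulty values are hypothetical and are not taken from EdisonOS itself.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Question:
    difficulty: float
    correct: bool
    is_field: bool = False  # field questions are ignored entirely


def module_weighted_mean(questions: List[Question]) -> float:
    """Scored weight divided by total possible weight, over non-field questions."""
    scored = [q for q in questions if not q.is_field]
    total_weight = sum(q.difficulty for q in scored)
    if total_weight == 0:
        return 0.0  # a module with no scored weight contributes 0
    earned = sum(q.difficulty for q in scored if q.correct)
    return earned / total_weight


def unbiased_unit_score(modules: List[Tuple[List[Question], float]], unit_min: float) -> float:
    """modules is a list of (questions, max_contribution) pairs."""
    return unit_min + sum(module_weighted_mean(qs) * cap for qs, cap in modules)


module_1 = [
    Question(1, True),
    Question(2, True),
    Question(3, False),
    Question(3, True, is_field=True),  # correct, but never counted
]
module_2 = [Question(2, True), Question(4, True)]

# Module 1: 3 / 6 = 0.5 -> 150; Module 2: 6 / 6 = 1.0 -> 300; plus the 200 minimum.
print(unbiased_unit_score([(module_1, 300), (module_2, 300)], unit_min=200))  # 650.0
```

The result here is the unbiased unit score; bias, rounding, and clamping are applied afterwards.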

Low-band adjustment (for adaptive paths):

When a student receives an easier second module on an adaptive test, EdisonOS can apply an optional penalty if their performance on the easy module is lower than their performance on the baseline module. This keeps scores comparable between students who took different adaptive paths and is intended to mirror the behavior of official adaptive SAT scoring.

ACT-style scoring (Mapped Score)

ACT-style tests use official-style lookup tables that map raw correct counts directly to scaled scores. Each raw score maps to a specific scaled score based on the table.

Default ranges for ACT-style tests:

Each unit (English, Math, Reading, Science): 1–36
Composite (average of included units): 1–36
Step size: typically 1

How it works for each unit:

EdisonOS counts the number of correct, non-field questions across all parts of the unit.
That raw count is used as the key into the unit's mapping table.
The mapping table returns the scaled score for that unit.
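The lookup above can be sketched as follows. The function name, the `ScaledScoreConfigError` exception, the dictionary-based question format, and the tiny four-entry mapping are all made up for illustration; the mapping is not an official ACT table.

```python
from typing import Dict, List


class ScaledScoreConfigError(Exception):
    """Raised when the mapping table has no entry for a realized raw score."""


def mapped_unit_score(parts: List[List[dict]], mapping: Dict[int, int]) -> int:
    # Count correct, non-field questions across every part of the unit.
    raw = sum(1 for part in parts for q in part if q["correct"] and not q["is_field"])
    if raw not in mapping:
        # Treat a gap as a configuration error instead of guessing a value.
        raise ScaledScoreConfigError(f"no mapping entry for raw score {raw}")
    return mapping[raw]


parts = [
    [{"correct": True, "is_field": False}, {"correct": True, "is_field": True}],
    [{"correct": True, "is_field": False}, {"correct": False, "is_field": False}],
]
toy_mapping = {0: 1, 1: 12, 2: 24, 3: 36}  # illustrative values only

print(mapped_unit_score(parts, toy_mapping))  # 24 (raw count is 2; the field question is ignored)
```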

Default vs. custom mappings:

EdisonOS ships with default ACT mappings for standard English, Math, Reading, and Science modules. For each unit, you can choose to use the default mapping, or attach a custom mapping table to match a specific released form or publisher-specific curve.

If a mapping entry is missing for a realized raw score, EdisonOS treats it as a configuration error rather than guessing, so you'll know exactly what needs to be fixed.

Bias, rounding, and clamping

After the unbiased unit score is calculated, EdisonOS applies a few finishing steps to produce the final score on the report.

Unit-level bias: Each unit can carry an optional bias value (positive or negative). This is useful for small calibration tweaks (for example, +10 SAT points or +1 ACT point) that align practice tests with official released tests without changing underlying weights or mappings. Bias is only applied when the unbiased score is strictly inside the unit's min/max range.

Step size: Each configuration defines a step size (10 for SAT, 1 for ACT). The unit score is rounded to the nearest step.

Clamping: After rounding, the score is clamped back into the configured [min, max] range so it never exceeds the scale.
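The three finishing steps can be sketched in order as below. The function name and the round-half-up behavior are assumptions for this example; the article does not say how exact half-steps are rounded.

```python
import math


def finalize_unit_score(unbiased: float, unit_min: int, unit_max: int,
                        step: int, bias: float = 0.0) -> int:
    score = unbiased
    # 1. Bias applies only when the unbiased score is strictly inside the range.
    if unit_min < unbiased < unit_max:
        score += bias
    # 2. Round to the nearest step (half rounds up here, by assumption;
    #    Python's built-in round() would round halves to even instead).
    score = math.floor(score / step + 0.5) * step
    # 3. Clamp back into [min, max].
    return max(unit_min, min(unit_max, score))


print(finalize_unit_score(643, 200, 800, step=10, bias=10))   # 650 (653 rounds to 650)
print(finalize_unit_score(800, 200, 800, step=10, bias=-10))  # 800 (at the ceiling, bias skipped)
print(finalize_unit_score(796, 200, 800, step=10, bias=10))   # 800 (806 rounds to 810, then clamps)
print(finalize_unit_score(24.3, 1, 36, step=1))               # 24
```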

Configuration is per practice test

Scaled score is enabled and configured per practice test. The settings sealed into the practice test at the time it was finalized are what drive reporting, not program-level defaults.

This gives you a few important capabilities:

You can adjust units, weights, mappings, ranges, and calibration on a per-test basis without affecting other tests in your program.
Program-level defaults make it easier to create new practice tests with consistent behavior, but they don't retroactively rewrite scaled scores on existing reports.
When rolling out a new scaled-score policy, you can decide whether to update defaults only for future tests, or migrate existing practice-test configurations if you need retroactive behavior.

The only things we allow you to customise are Score Capping and Bias.

Score Capping sets a unit's floor and ceiling: the score is clamped into the [min, max] range (e.g., 200–800 for SAT, 1–36 for ACT).

Bias is a small +/– adjustment that aligns a test with an official form (e.g., +10 SAT, +1 ACT). It is skipped if the unbiased score is already at the floor or ceiling.

For ACT, edit the score mapping table to match a specific form. Cover every raw score from 0 to the maximum number of scored, non-field questions; a missing entry is a configuration error.

Mapping tables (ACT)

Use default ACT mappings out of the box, or attach custom tables per unit to match specific forms or publisher curves.

What happens when scaled score is misconfigured

EdisonOS treats incorrect or incomplete scaled-score configuration as a configuration issue, not a grading issue. If scaled score fails at runtime (for example, because of missing difficulty scores, invalid mappings, or inconsistent structure), EdisonOS will:

Still successfully generate the raw report.
Mark the scaled-score field as errored rather than returning a guessed or partial value.

This means reports always show valid raw scores, and you'll get a clear signal whenever scaled-score configuration needs attention. When investigating unexpected scaled scores, the right starting point is to validate configuration and inputs.
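The "raw first, fail loudly" behavior described above can be pictured with a small sketch. The `Report` type, `build_report` function, and `ScaledScoreConfigError` exception are hypothetical names chosen for this example and do not describe EdisonOS internals.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class ScaledScoreConfigError(Exception):
    """Stand-in for any scaled-score configuration problem."""


@dataclass
class Report:
    raw_score: int
    scaled_score: Optional[int] = None
    scaled_error: Optional[str] = None


def build_report(raw_score: int, compute_scaled: Callable[[], int]) -> Report:
    # The raw report is complete before any scaling is attempted.
    report = Report(raw_score=raw_score)
    try:
        report.scaled_score = compute_scaled()
    except ScaledScoreConfigError as exc:
        # Mark the field as errored; never fall back to a guessed value.
        report.scaled_error = str(exc)
    return report


def broken_scaling() -> int:
    raise ScaledScoreConfigError("no mapping entry for raw score 31")


report = build_report(raw_score=31, compute_scaled=broken_scaling)
print(report.raw_score, report.scaled_score, report.scaled_error)
# 31 None no mapping entry for raw score 31
```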

Quick summary

Raw score first, scaled score second. EdisonOS always grades every question and stores raw scores; scaled score is a deterministic transformation on top of that.
Field questions never affect scaled score. They are used only in raw counts and explicitly excluded from both scaled score and difficulty-weighted sums.
SAT-style tests use difficulty-weighted modules. Harder questions carry more weight; the score reflects how much of the available difficulty was converted into correct answers.
ACT-style tests use official-style lookup tables. Each raw score maps to a scaled score, with support for both default and custom mappings.
If scaled score is misconfigured, EdisonOS fails loudly. Reports still show raw scores, and scaled score is marked as errored instead of silently returning misleading values.

FAQs

The following FAQs cover the most common questions admins, tutors, and content teams ask once they start working with scaled scores in EdisonOS.

Why does my student see a raw score on one practice test and a scaled score on another?

A - Because scaled score is enabled per practice test. If a test doesn't have scaled scoring turned on in its settings, the report will only show the raw score. If you want a particular test to display a scaled score, open its settings and enable scaled scoring with the right strategy (SAT-style or ACT-style).

Do problem sets have scaled scores?

A - No. Scaled scores only exist for practice tests. Problem sets always show raw performance (correct, incorrect, skipped, accuracy) but never a scaled score.

My student's raw and scaled scores look out of sync. Which one should I trust?

A - Always trust the raw score as the source of truth. The scaled score is a derived view on top of the raw score and exists to make practice scores comparable to official test scales. If the two seem inconsistent, the issue is almost always in the scaled-score configuration (e.g., a mapping table or difficulty weight), not in the grading itself.

What's the difference between raw score, scaled score, and percentile?

A - Raw score is the count of questions answered correctly. Scaled score is that raw performance translated onto a standard scale (like SAT 400–1600 or ACT 1–36) using either a weighted mean (SAT) or a lookup table (ACT). Percentile is a comparison to other test-takers. EdisonOS does not currently calculate percentile rankings, only raw and scaled scores. If you need percentile context, refer to the official College Board or ACT percentile tables for the most recent test administration.

Two students answered the same number of questions correctly. Why are their scaled scores different?

A - This usually happens for one of three reasons:

They got different questions right. On SAT-style tests, harder questions carry more weight. A student who got 30 hard questions right will score higher than a student who got 30 easy questions right, even though the raw count is the same.
They took different adaptive paths. On adaptive SAT-style tests, one student may have routed into Module 2 Hard and the other into Module 2 Easy. The scoring uses different module weights for each path, and the low-band adjustment may apply to the easier path.
Different field questions appeared in their attempts. Field questions count toward raw score but are excluded from scaled-score calculation. So if two students have the same raw count but the split between scored and field questions is different, their scaled scores can diverge.
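To make the first reason concrete, here is a small worked example. It assumes a single-module unit with a 200 minimum and a 600 maximum contribution, and difficulty scores of 1 (easy) and 3 (hard); all of these numbers and the `unit_score` helper are invented for illustration.

```python
from typing import List

difficulties = [1, 1, 1, 3, 3, 3]  # three easy and three hard questions, none of them field

student_a = [False, False, False, True, True, True]  # 3 correct, all hard
student_b = [True, True, True, False, False, False]  # 3 correct, all easy


def unit_score(correct: List[bool], unit_min: int = 200, max_contribution: int = 600) -> float:
    earned = sum(d for d, ok in zip(difficulties, correct) if ok)
    weighted_mean = earned / sum(difficulties)
    return unit_min + weighted_mean * max_contribution


print(unit_score(student_a))  # 650.0 (9 / 12 of the available difficulty)
print(unit_score(student_b))  # 350.0 (3 / 12 of the available difficulty)
```

Both students have a raw count of 3, but they converted very different shares of the available difficulty.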

How is the SAT-style scaled score actually computed?

A - For each module, EdisonOS:

Sums the difficulty scores of every non-field question the student answered correctly (this is the "scored weight").
Sums the difficulty scores of every non-field question in the module (the "total possible weight").
Divides the two to get a weighted mean: in effect, the share of available difficulty the student converted.
Multiplies that weighted mean by the module's configured maximum contribution.
Adds the contributions from every module in the unit, then adds the unit's minimum score to get the unbiased unit score.
Applies bias, rounds to the step size, and clamps into the allowed range to produce the final scaled score.

How is the ACT-style scaled score actually computed?

A - Much simpler. EdisonOS counts the number of correct, non-field questions in the unit, then looks that raw count up in a mapping table to get the scaled score. There's no weighting and no adaptive logic, just raw count → scaled score.

Why is a 0/0 module showing as 0% instead of being skipped?

A - If a module has no scored, non-field questions (for example, due to misconfiguration), its weighted mean defaults to 0 rather than being undefined. This is by design as it prevents the calculation from failing entirely. If you see a 0/0 anywhere, treat it as a configuration issue and open the practice test settings to investigate.

Why scores look the way they do

My student got 95% of questions correct but the scaled score is only 1450. Shouldn't it be near 1600?

A - Not necessarily, and this is one of the most misunderstood parts of standardized scoring. Scaled scores are calibrated against the difficulty distribution of questions on the official test, not as a simple percentage of correct answers. Hitting 1600 typically requires near-perfect performance on hard questions specifically, not just a high overall accuracy. Missing a few hard questions can cost more points than missing many easy ones, because hard questions carry more difficulty weight.

My student went from 1180 to 1200 even though they got more questions right. Why didn't the score jump more?

A - Two reasons this commonly happens:

Step size. SAT-style scores round to the nearest 10. A real improvement of, say, 8 points may not show up because the rounded score didn't cross the next step.
Where the gains came from. If the additional correct answers were on easy questions, they contributed less weight than a single hard-question gain would have. Scaled score isn't a linear function of raw count.

To unlock larger jumps, focus on converting medium and hard questions, not just adding more easy correct answers.

My student's scaled score went down even though they got more questions right than last time. How is that possible?

A - This is rare but possible. Common causes:

Gains on easier questions, losses on harder ones. The student improved on easy questions but lost ground on harder ones, lowering the weighted mean despite a higher raw count.
Different adaptive path. If they routed into Module 2 Easy this time and Module 2 Hard last time, the scoring caps and weights are different, and the low-band adjustment may apply.
Different test form. If you're comparing across two different practice tests with different difficulty distributions or different scaled-score configurations, the scales aren't perfectly equivalent.

The fix is usually to look at the raw breakdown by difficulty (Easy / Medium / Hard) on the Detailed Breakdown tab to see where the gains and losses actually happened.

Two practice tests show very different scaled scores even though my student felt they performed similarly. What's going on?

A - Each practice test has its own scaled-score configuration sealed at the time it was finalized. Two tests can have different difficulty distributions, different module weights, different bias values, or even different mapping tables. Even with identical raw performance, two tests can produce different scaled scores. Use the trend across multiple tests, not the gap between any two specific tests, to assess progress.

Why is the scaled score showing 410 when the minimum is 200?

A - 410 is a valid score, not a floor: it sits above the unit's minimum and lands on a 10-point step within the 200–800 range. The minimum (usually 200 for a SAT unit) is only the lowest score a student can receive. If you're seeing a floor that doesn't match what you expect, check the unit's minimum score and the rounding step size in the practice test's scaled-score configuration.

Why did my student's score not improve much even though they answered more questions correctly?

A - SAT-style scoring uses a weighted mean of question difficulty, not a simple count of correct answers. If a student answered more easy questions correctly but missed harder ones, their weighted score may not move much, because each correct answer is worth its question's difficulty weight. To improve the scaled score meaningfully, students need to convert harder questions, not just rack up easier ones.

What is the low-band adjustment and when does it kick in?

A - The low-band adjustment is a penalty that can apply when a student takes the easier path on an adaptive SAT-style test (i.e., they got routed to Module 2 Easy instead of Module 2 Hard) and underperforms on the easy module relative to the baseline module. It keeps scores comparable between students who took different adaptive paths. The adjustment only applies if it's explicitly configured for the unit.

Does the low-band adjustment apply to every adaptive SAT test?

A - No. It only applies if it's been explicitly configured for the unit, with a baseline part, an easy part, and a penalty per raw point. Without that configuration, EdisonOS won't apply any cross-path penalty, and the easy path will simply use its own scoring contribution as configured.
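The exact penalty formula is not spelled out in this article, so the sketch below is only one plausible reading of the three configured pieces. It assumes, purely for illustration, that "performance" is compared as raw correct counts and that the penalty is the shortfall multiplied by the penalty per raw point. The `LowBandConfig` type and `low_band_penalty` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class LowBandConfig:
    baseline_part: str
    easy_part: str
    penalty_per_raw_point: float


def low_band_penalty(raw_by_part: Dict[str, int], config: Optional[LowBandConfig]) -> float:
    if config is None:
        return 0.0  # not configured for the unit: no cross-path penalty
    if config.easy_part not in raw_by_part:
        return 0.0  # the student was not routed to the easy path
    # Assumed formula: penalize each raw point by which the easy module
    # trails the baseline module.
    shortfall = raw_by_part[config.baseline_part] - raw_by_part[config.easy_part]
    return max(0, shortfall) * config.penalty_per_raw_point


cfg = LowBandConfig("Module 1", "Module 2 Easy", penalty_per_raw_point=5.0)
print(low_band_penalty({"Module 1": 15, "Module 2 Easy": 12}, cfg))  # 15.0
print(low_band_penalty({"Module 1": 15, "Module 2 Hard": 12}, cfg))  # 0.0
print(low_band_penalty({"Module 1": 15, "Module 2 Easy": 12}, None))  # 0.0
```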

Can I change a question's difficulty weight after a test has been taken?

A - You can change the difficulty weight, but it won't retroactively rewrite existing reports. Already-finalized reports use the difficulty weights that were sealed at the time of grading. Updated weights will only affect new attempts going forward.

My SAT-style test caps at 1500 instead of 1600. Why?

A - Check the unit-level maximum score for both Reading & Writing and Math — if either is configured below 800, the total will cap below 1600. Also check the per-module maximum contribution values: if the sum of those contributions across modules in a unit doesn't reach (unit max − unit min), the unit score can never reach the configured maximum.
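A quick way to reason about the second check is to add up the maximum contributions along each adaptive path and compare the result with the unit maximum. The helper below is a hypothetical sketch; the path names and contribution values are invented for the example.

```python
from typing import Dict, List


def unreachable_paths(unit_min: int, unit_max: int,
                      paths: Dict[str, List[int]]) -> Dict[str, int]:
    """Return {path name: best possible score} for paths that cannot reach unit_max."""
    gaps = {}
    for name, contributions in paths.items():
        best = unit_min + sum(contributions)
        if best < unit_max:
            gaps[name] = best
    return gaps


# Reading and Writing: 200 + 300 + 300 = 800, so the maximum is reachable.
print(unreachable_paths(200, 800, {"Module 1 + Module 2 Hard": [300, 300]}))
# {}

# Math: 200 + 250 + 250 = 700, so the total tops out at 800 + 700 = 1500.
print(unreachable_paths(200, 800, {"Module 1 + Module 2 Hard": [250, 250]}))
# {'Module 1 + Module 2 Hard': 700}
```

A lower ceiling on an easier adaptive path may be deliberate, so a flagged path is a prompt to review the configuration rather than proof of a mistake.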

What happens if my mapping table is missing a value?

A - EdisonOS will treat it as a configuration error and mark the scaled score as errored on that report. The raw report will still generate normally, but the scaled-score field will fail rather than guess. Make sure your mapping covers every possible raw score from 0 up to the maximum number of scored, non-field questions in the unit.
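A coverage check like the one below can catch gaps before any student hits them. The function name and the sample table are illustrative assumptions, not part of EdisonOS.

```python
from typing import Dict, List


def missing_raw_scores(mapping: Dict[int, int], max_scored_questions: int) -> List[int]:
    """List every raw score from 0 to the maximum that has no mapping entry."""
    return [raw for raw in range(max_scored_questions + 1) if raw not in mapping]


toy_mapping = {0: 1, 1: 5, 3: 12}  # illustrative values; entries for 2 and 4 are missing
print(missing_raw_scores(toy_mapping, max_scored_questions=4))  # [2, 4]
```

An empty list means every realizable raw score has a scaled value.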

Can I use a custom mapping table for one unit and the default for another?

A - Yes. Mapping is configured per unit. You can mix and match: for example, use the default ACT English mapping while attaching a custom mapping for ACT Math to match a specific released form.

Why is ACT Science excluded from the composite by default?

A - To match the standard ACT composite, which is the average of English, Math, and Reading. If you want Science to count toward the composite for a specific test, you can override that behavior in the unit's configuration by including it in the total.

My ACT composite came out as 24.3 but the scaled score shows 24. Where did the 0.3 go?

A - ACT scores round to the nearest whole number (step size of 1). The composite is calculated as the average of included unit scores, so a 24.3 unbiased average rounds down to 24 on the report. The unrounded value is preserved internally for audit purposes but the displayed score follows the configured step size.

What exactly is a "field question"?

A - A field question is a question marked as experimental or unscored for the purposes of scaled scoring. They're typically used to pilot new questions before they're calibrated into live scoring, or to collect performance data on questions you're considering for future tests. Field questions still count toward raw score (correct/incorrect/skipped on the report), but they're excluded from scaled-score calculation: they don't contribute difficulty weight on SAT-style tests or count toward the lookup on ACT-style tests.

Will the student know which questions are field questions?

A - No. Field questions look identical to scored questions during the test. The student answers them the same way, and they're presented in the normal flow. The "field" status is purely a back-end flag that controls scoring.

If a field question is excluded, does the test still run for the configured time?

A - Yes. Timing is independent of scoring. A 27-question Reading & Writing module that contains 25 scored questions and 2 field questions still runs for the full configured duration. The field questions take up real time on the test even though they don't contribute to scoring.

Field questions seem to be affecting my student's scaled score. Is that possible?

A - No, field questions are explicitly excluded from the difficulty-weighted sums (SAT-style) and the lookup count (ACT-style) used for scaling. They do still appear in the raw breakdown on the report (correct/incorrect/skipped), so a student who got a field question right will see it in the raw view but it won't move the scaled score in either direction. If the scaled score still seems off, double-check that the questions you think are field questions are actually flagged as field questions in the question metadata. A common mistake is forgetting to flag a question as a field question after it's been added to a test.

I updated my program-level scaled-score defaults. Why didn't my old reports change?

A - Program-level defaults are only used when creating new practice tests. Once a test is built, its scaled-score settings are sealed at the practice-test level, and that's what drives every report from that test going forward. To change scoring on an existing test, edit the test's own scaled-score configuration directly, or rebuild the test from the new defaults.

Can I have different scaled-score settings for two practice tests in the same program?

A - Yes. Each practice test owns its own scaled-score configuration. This is intentional: it lets you run a calibrated official-style mock alongside a custom diagnostic in the same program without one affecting the other.

What is bias and when should I use it?

A - Bias is a small, optional adjustment (positive or negative) added to a unit's score after it's calculated but before rounding. It's most often used to nudge a practice test's scale to match an official released form (for example, +10 SAT points or +1 ACT point) without rewiring underlying weights or mappings. Bias is automatically skipped if the unbiased score is already at the floor or ceiling of the unit's range.

Why was my bias not applied? My score should have been 1290 but it's showing as 1280.

A - Bias is intentionally skipped when the unbiased score is at the unit's floor or ceiling. This avoids "double-clamping" where bias would push a score outside the allowed range only to have it clamped back. If you expected bias to apply but didn't see it, check whether the unbiased unit score landed at exactly the unit's minimum or maximum.

I want a custom test with a 0–100 scaled score instead of SAT or ACT scales. Is that possible?

A - Yes. EdisonOS supports custom units with any min/max range and any step size. Define the unit with a 0 minimum, a 100 maximum, and pick the strategy (Weighted Mean or Mapped Score) that fits how you want to score it. This is how publisher-specific or bespoke tests get built on the platform.

My student's report shows a scaled-score error. What does that mean?

A - It means EdisonOS detected a configuration issue while computing the scaled score (most often a missing mapping value, a missing difficulty score, or an inconsistent structure between the test and its scaled-score settings). The raw report is still valid — only the scaled-score field is errored. Open the practice test's configuration to identify and fix the issue, then new attempts will compute scaled scores correctly.

My student got a perfect raw score but the scaled score isn't at the maximum. Why?

A - Check whether bias is configured for that unit and whether the unit's max score matches what you expect. Also confirm there are no field questions skewing the count — a perfect raw across non-field questions will hit the unit's max, but if the configuration sets the max lower than the standard scale, that's where the cap is coming from.

Why are practice scaled scores sometimes a little different from official scores?

A - Because no practice scale can perfectly replicate an official one. Official score curves are calibrated against thousands of real test-takers. EdisonOS scaled scores are tuned to match official forms as closely as possible (using bias, mapping tables, and difficulty weights), but small differences are expected. The trend across multiple practice tests is more meaningful than the exact number on any one test.

A parent is asking why their child's scaled score went down between tests. How do I explain that?

A - Frame it around the difference between practice tests, not student regression:

"Scaled scores aren't a fixed measurement of ability; they're a snapshot of how a student performed on a specific test, with that test's specific difficulty mix and adaptive path."
"We expect scores to fluctuate between practice tests, especially early in prep. What matters is the trend over 4–6 tests, not the score on any single one."
"If we look at the difficulty breakdown, we can usually see exactly where the score moved and turn that into a focused plan for the next session."