Capability Brief · WCAG 2.1 AA · Section 508

Accessibility, in plain sight

Instructors upload a file, see what's wrong in plain language, and apply fixes selectively — with a preview that shows time, cost, and score impact before anything runs.

One module of the Agoge Platform. This page covers what it does, how it decides what to fix, and the test backstop behind it.

  • 4 file formats
  • 39 / 39 unit tests passing
  • 9 / 9 reference fixtures clean
  • +88 highest score gain (bad → remediated)

Higher-ed course materials are systematically inaccessible to disabled students — and most instructors don't have time to learn assistive-tech testing. This module replaces a 30-to-60-minute manual review with a guided one-to-three-click flow, surfacing concrete WCAG issues alongside their suggested fixes and (where safe to do so) applying them automatically.

How the module works

A file moves through the pipeline below. Two scans run in parallel — programmatic (WCAG checks against the file's structure) and AI semantic (catches recall gaps the programmatic scan can't reliably find). The results merge, get costed, and the user decides what to apply.

  • Uploaded file (PPTX · DOCX · HTML · PDF)
  • Programmatic scanner (WCAG 2.1 · Section 508), in parallel with the AI semantic scan (color-only · fake headings · etc.)
  • Issues + score (0-100), decorated with prior declines
  • Cost estimator (per-fix time · $ · score Δ)
  • Gate: slow · costly · approximate?
      No → Apply All (cheap deterministic case)
      Yes → Preview modal (user ticks fixes to apply)
  • Remediator (per-type · declined logged) → Fixed file
End-to-end flow: upload → dual-scan → cost estimate → gate decision → remediation.
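
The dual-scan step can be pictured with a small sketch. This is not the module's code: the `Issue` fields, the two stand-in scan functions, and the rule that a programmatic finding wins over a duplicate AI finding are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    code: str      # hypothetical issue identifier, e.g. "missing-lang"
    location: str  # hypothetical locator, e.g. "html" or "slide 3"
    source: str    # "programmatic" or "ai"


def programmatic_scan(content: str) -> list[Issue]:
    # Stand-in for the structural WCAG checks.
    issues = []
    if "<html lang=" not in content:
        issues.append(Issue("missing-lang", "html", "programmatic"))
    return issues


def ai_semantic_scan(content: str) -> list[Issue]:
    # Stand-in for the AI pass; a real one would call a model provider.
    issues = []
    if "shown in red" in content:
        issues.append(Issue("color-only", "body", "ai"))
    return issues


def dual_scan(content: str) -> list[Issue]:
    # Run both scans concurrently, then merge their findings.
    with ThreadPoolExecutor(max_workers=2) as pool:
        prog = pool.submit(programmatic_scan, content)
        ai = pool.submit(ai_semantic_scan, content)
        merged = {}
        for issue in prog.result() + ai.result():
            # Same code and location counts as one finding; the
            # programmatic result is listed first, so it is the one kept.
            merged.setdefault((issue.code, issue.location), issue)
    return list(merged.values())
```

Calling `dual_scan` on a page with no `lang` attribute and a "shown in red" instruction returns one finding from each scan; the merged list is what would then be scored and costed.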

When does the user see a preview?

For trivial fixes — set a missing language attribute, add a main landmark, generate one alt text — the work is fast and deterministic. The platform applies them straight through, one click.

For anything heavier, the user sees a screen showing exactly what's about to happen and can opt out per fix. Three things trigger the preview:

  • Total time over 30 seconds — OCR pipelines, multi-AI-call alt-text generation
  • Total cost over $0.05 — large batches of paid-AI calls
  • Approximate output anywhere — AI-generated content, OCR, anything that needs human verification

A per-user toggle ("always show me what'll be fixed") lets cautious instructors keep the preview on for every run, whatever the estimates say.

Gate decision:

  • Total time > 30 s? (OCR · multiple AI calls)
  • Total cost > $0.05? (many AI calls)
  • Any approximate output? (AI-generated · OCR)
  • OR user toggle: "always show"

  None of the above → Skip preview: apply every auto-fixable issue (legacy one-click behavior)
  Any condition true → Show preview modal: per-fix checkboxes · estimates; apply only what the user ticks
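
As a rough illustration, the gate reduces to one boolean over the summed estimates. The thresholds come from the list above; the `FixEstimate` type, its field names, and the `needs_preview` function are hypothetical and not taken from the platform's code.

```python
from dataclasses import dataclass

TIME_LIMIT_S = 30.0    # preview when total time exceeds 30 seconds
COST_LIMIT_USD = 0.05  # preview when total cost exceeds $0.05


@dataclass
class FixEstimate:
    name: str          # hypothetical fix identifier
    seconds: float     # estimated time for this fix
    cost_usd: float    # estimated paid-AI cost for this fix
    approximate: bool  # True for AI-generated or OCR output


def needs_preview(fixes, always_show=False):
    """Return True when the preview modal should be shown."""
    total_time = sum(f.seconds for f in fixes)
    total_cost = sum(f.cost_usd for f in fixes)
    return (
        always_show                          # per-user toggle
        or total_time > TIME_LIMIT_S         # slow
        or total_cost > COST_LIMIT_USD       # costly
        or any(f.approximate for f in fixes)  # needs human verification
    )
```

With two cheap deterministic fixes, `needs_preview` returns False and everything applies in one click; adding a single OCR fix flips it to True because the output is approximate, even though time and cost stay under the limits.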

What gets detected

The module pairs programmatic checks with an AI semantic pass that targets six recall gaps the programmatic scan can't reliably find.

| Format | Programmatic checks | Auto-remediable |
|---|---|---|
| HTML | language declaration · page title · alt text · empty-alt heuristic · color-only signaling · heading hierarchy · form labels · link text · main landmark · duplicate IDs | lang · title · alt text · main landmark |
| PPTX | slide titles · alt text (incl. filename-placeholder detection) · table headers · minimum font size · embedded media captions | slide titles · alt text |
| DOCX | alt text (incl. placeholder detection) · heading hierarchy · table headers · hyperlink text quality | alt text |
| PDF | document title · language · tagged structure · image-only detection · bookmark presence | title · /Lang · OCR for image-only pages |

AI semantic pass targets: color-only signaling, inadequate alt text, fake headings, reading-order problems, placeholder-as-label, and any other clear WCAG/508 violation visible in the content excerpt.
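
To make the programmatic side concrete, here is a minimal sketch of four of the HTML checks (language declaration, page title, main landmark, alt text) using only Python's standard-library `html.parser`. The class name, the finding codes, and the choice to treat `alt=""` as a deliberate decorative image are assumptions for the example, not the module's actual scanner.

```python
from html.parser import HTMLParser


class BasicHtmlChecks(HTMLParser):
    """Collects a few structural facts while the markup is parsed."""

    def __init__(self):
        super().__init__()
        self.has_lang = False
        self.has_title = False
        self.has_main = False
        self.images_missing_alt = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "html" and attr_map.get("lang"):
            self.has_lang = True
        if tag == "title":
            self._in_title = True
        if tag == "main" or attr_map.get("role") == "main":
            self.has_main = True
        # An explicit alt="" marks a decorative image, so only a
        # missing attribute is counted here.
        if tag == "img" and "alt" not in attr_map:
            self.images_missing_alt += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # A title only counts if it contains visible text.
        if self._in_title and data.strip():
            self.has_title = True


def scan_html(markup):
    """Return a list of illustrative finding codes for the markup."""
    checker = BasicHtmlChecks()
    checker.feed(markup)
    checker.close()
    findings = []
    if not checker.has_lang:
        findings.append("missing-lang")
    if not checker.has_title:
        findings.append("missing-title")
    if not checker.has_main:
        findings.append("missing-main-landmark")
    if checker.images_missing_alt:
        findings.append("img-missing-alt")
    return findings
```

Checks such as color-only signaling or fake headings depend on meaning rather than structure, which is why they sit with the AI semantic pass rather than in a parser like this one.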

Test backstop

Three layers of regression protection. Every code change runs the unit suite locally; the fixture verifier catches scanner-output drift; the round-trip drivers exercise the live AI providers and the OCR pipeline.

  • Round-trip: 4 driver scripts (slow · live AI)
  • Reference fixtures: 9 hand-built · verifier-checked · documented expected output (stable)
  • Unit tests: 39 across scanner · estimator · remediator · API · per-format helpers (fast · isolated)
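
The middle layer, a verifier that catches scanner-output drift, might look something like the sketch below. It assumes each fixture has a documented set of expected finding codes and that the scanner can be passed in as a plain function; `verify_fixture`, `verify_all`, and the fixture names in the usage note are hypothetical.

```python
def verify_fixture(name, expected_codes, scan):
    """Compare one fixture's scan output with its documented expectation."""
    actual = set(scan(name))
    expected = set(expected_codes)
    return {
        "missing": sorted(expected - actual),     # findings the scanner lost
        "unexpected": sorted(actual - expected),  # findings that newly appeared
    }


def verify_all(expectations, scan):
    """Return drift per fixture; an empty dict means every fixture is clean."""
    report = {}
    for name, expected_codes in expectations.items():
        drift = verify_fixture(name, expected_codes, scan)
        if drift["missing"] or drift["unexpected"]:
            report[name] = drift
    return report
```

For example, if `expectations` maps `"bad.html"` to `["missing-lang", "img-missing-alt"]` and the scanner stops reporting `missing-lang`, `verify_all` returns an entry for `"bad.html"` with that code under `"missing"`. Reporting both directions matters: a new heuristic that surfaces extra findings is drift too, and the documented expectation needs updating on purpose.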

Recent delta highlights

| Scenario | Before | After | Note |
|---|---|---|---|
| HTML "bad" tier, single AI revision cycle | 9 / 100 | 97 / 100 | +88 points across 11 issues |
| Scanned PDF fixture (image-only) | 62 / 100 | 85 / 100 | OCR recovered ~539 chars; image-only finding cleared |
| Reportlab untitled PDF (metadata only) | 66 / 100 | 82 / 100 | Title + /Lang fixes applied |
| HTML "medium" tier programmatic findings | 1 issue | 3 issues | New empty-alt and color-only heuristics surfaced gaps |

What's shipped, what's next

| Capability | Status | Detail |
|---|---|---|
| HTML / PPTX / DOCX scanning | Shipped | 10+ WCAG categories per format |
| PDF scanning | Shipped | Title · /Lang · tagging · bookmarks · image-only detection |
| AI semantic scan | Shipped | Tuned for six named recall gaps |
| Auto-remediation (alt text, lang, titles) | Shipped | Per-format remediators with user control over which fixes apply |
| OCR for image-only PDFs | Shipped | PyMuPDF render @ 300 DPI + Tesseract; ~5s/page; produces a searchable text layer |
| Preview workflow (per-fix checkboxes + cost estimates) | Shipped | Gates on >30s, >$0.05, or approximate output; per-user toggle |
| "Previously declined" tracking | Shipped | Re-scans of the same file content remember what the user chose to skip |
| Analytics logging (per-user, per-course, per-org) | Shipped | FERPA-safe event row per remediation; GET /accessibility/analytics/{my,course/{id},org} |
| Org accessibility-manager dashboard | Shipped | KPI cards, fix-type bar chart, recent events list. Gated by ORG_ADMIN role on the catalog and the route. |
| PDF structure-tree tagging (best-effort) | Shipped | /MarkInfo + /StructTreeRoot placeholder via pikepdf. Clears the scanner's "PDF is not tagged" finding. Not a substitute for properly authored tagging; marked "approximate" so users see it in the preview gate. |
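
One plausible way to implement "previously declined" tracking is to key declines on a hash of the file's bytes, so a re-upload of identical content matches regardless of filename. The sketch below assumes exactly that; the in-memory `DeclineLog` class, the `fix_id` field, and the per-user scoping are illustrative guesses rather than the platform's design.

```python
import hashlib


def content_key(data: bytes) -> str:
    # Identical bytes give an identical key, whatever the file is called.
    return hashlib.sha256(data).hexdigest()


class DeclineLog:
    """In-memory stand-in for wherever declined fixes are persisted."""

    def __init__(self):
        self._declined = {}  # (user_id, content hash) -> set of fix ids

    def record(self, user_id, data, fix_ids):
        key = (user_id, content_key(data))
        self._declined.setdefault(key, set()).update(fix_ids)

    def decorate(self, user_id, data, issues):
        # Return copies of the issues, each flagged if the user skipped it before.
        declined = self._declined.get((user_id, content_key(data)), set())
        return [
            {**issue, "previously_declined": issue["fix_id"] in declined}
            for issue in issues
        ]
```

A user who unticks one fix on a file and later re-scans the same bytes would see that issue come back with `previously_declined` set to True; editing the file changes the hash, so the flags reset, which matches the "same file content" wording in the table.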