Capability Brief · WCAG 2.1 AA · Section 508

Accessibility, in plain sight

Instructors upload a file, see what's wrong in plain language, and apply fixes selectively — with a preview that shows time, cost, and score impact before anything runs.

One module of the Agoge Platform. This page covers what it does, how it decides what to fix, and the test backstop behind it.

4

File formats

39 / 39

Unit tests passing

9 / 9

Reference fixtures clean

+88

Highest score gain
(bad → remediated)

Higher-ed course materials are systematically inaccessible to disabled students — and most instructors don't have time to learn assistive-tech testing. This module replaces a 30-to-60-minute manual review with a guided one-to-three-click flow, surfacing concrete WCAG issues alongside their suggested fixes and (where safe to do so) applying them automatically.

How the module works

A file moves through the pipeline below. Two scans run in parallel — programmatic (WCAG checks against the file's structure) and AI semantic (catches recall gaps the programmatic scan can't reliably find). The results merge, get costed, and the user decides what to apply.

End-to-end flow: upload → dual-scan → cost estimate → gate decision → remediation.
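The dual-scan and merge steps can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the `Finding` type, the rule names, the two stand-in scan functions, and the "programmatic wins on duplicates" merge rule are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    rule: str    # hypothetical rule ID, e.g. "missing-lang"
    source: str  # "programmatic" or "ai"


def programmatic_scan(content: str) -> list[Finding]:
    # Stand-in for the structural WCAG checks (hypothetical logic).
    findings = []
    if "<html" in content and "lang=" not in content:
        findings.append(Finding("missing-lang", "programmatic"))
    return findings


def ai_semantic_scan(content: str) -> list[Finding]:
    # Stand-in for the AI pass; a real version would call a model provider.
    findings = []
    if "in red" in content:
        findings.append(Finding("color-only-signaling", "ai"))
    return findings


def dual_scan(content: str) -> list[Finding]:
    # Run both scans concurrently, then merge to one finding per rule.
    with ThreadPoolExecutor(max_workers=2) as pool:
        prog = pool.submit(programmatic_scan, content)
        ai = pool.submit(ai_semantic_scan, content)
        merged: dict[str, Finding] = {}
        for finding in prog.result() + ai.result():
            # Assumed merge rule: the first (programmatic) finding wins.
            merged.setdefault(finding.rule, finding)
    return list(merged.values())
```

The merged list is what would then be costed and handed to the gate decision.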

When does the user see a preview?

For trivial fixes — set a missing language attribute, add a main landmark, generate one alt text — the work is fast and deterministic. The platform applies them straight through, one click.

For anything heavier, the user sees a screen showing exactly what's about to happen and can opt out per fix. Three things trigger the preview:

Total time over 30 seconds — OCR pipelines, multi-AI-call alt-text generation
Total cost over $0.05 — large batches of paid-AI calls
Approximate output anywhere — AI-generated content, OCR, anything that needs human verification

A per-user toggle ("always show me what'll be fixed") lets cautious instructors keep the preview permanently on.
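The gate described above reduces to a small predicate. The sketch below uses the two thresholds stated in the list (30 seconds, $0.05); the `PlannedFix` type, its field names, and the `needs_preview` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

TIME_LIMIT_SECONDS = 30.0  # from the text: total time over 30 seconds
COST_LIMIT_USD = 0.05      # from the text: total cost over $0.05


@dataclass
class PlannedFix:
    name: str
    est_seconds: float
    est_cost_usd: float
    approximate: bool = False  # True for AI-generated or OCR output


def needs_preview(fixes: list[PlannedFix], always_preview: bool = False) -> bool:
    """Return True when the user should see the preview screen first."""
    if always_preview:  # the per-user toggle forces the gate on
        return True
    total_seconds = sum(f.est_seconds for f in fixes)
    total_cost = sum(f.est_cost_usd for f in fixes)
    return (
        total_seconds > TIME_LIMIT_SECONDS
        or total_cost > COST_LIMIT_USD
        or any(f.approximate for f in fixes)  # one approximate fix is enough
    )
```

Under these assumptions, a batch containing only a language-attribute fix and a main-landmark fix passes straight through, while a seven-page OCR job at roughly 5 seconds per page (35 seconds total) trips the time threshold even before its approximate-output flag is considered.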

What gets detected

Programmatic checks plus an AI semantic pass that hunts six specific recall gaps the programmatic scan can't reliably find.

Format	Programmatic checks	Auto-remediable
HTML	language declaration · page title · alt text · empty-alt heuristic · color-only signaling · heading hierarchy · form labels · link text · main landmark · duplicate IDs	lang · title · alt text · main landmark
PPTX	slide titles · alt text (incl. filename-placeholder detection) · table headers · minimum font size · embedded media captions	slide titles · alt text
DOCX	alt text (incl. placeholder detection) · heading hierarchy · table headers · hyperlink text quality	alt text
PDF	document title · language · tagged structure · image-only detection · bookmark presence	title · /Lang · OCR for image-only pages

AI semantic pass targets: color-only signaling, inadequate alt text, fake headings, reading-order problems, placeholder-as-label, and any other clear WCAG/508 violation visible in the content excerpt.
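To make the programmatic side concrete, here is a minimal sketch of four of the HTML checks from the table (language declaration, page title, main landmark, missing alt text), written against Python's standard `html.parser`. The class name, function name, and finding IDs are hypothetical; the module's real scanner may be structured quite differently.

```python
from html.parser import HTMLParser


class BasicHtmlChecks(HTMLParser):
    """Collects evidence for four structural checks while parsing."""

    def __init__(self) -> None:
        super().__init__()
        self.has_lang = False
        self.has_title = False
        self.has_main = False
        self.images_missing_alt = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "html" and (attr_map.get("lang") or "").strip():
            self.has_lang = True
        if tag == "title":
            self._in_title = True
        if tag == "main" or attr_map.get("role") == "main":
            self.has_main = True
        # alt="" is valid for decorative images; only a missing attribute counts.
        if tag == "img" and "alt" not in attr_map:
            self.images_missing_alt += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.has_title = True  # a non-empty <title> was found


def scan_html(html: str) -> list[str]:
    checker = BasicHtmlChecks()
    checker.feed(html)
    checker.close()
    findings = []
    if not checker.has_lang:
        findings.append("missing-lang")
    if not checker.has_title:
        findings.append("missing-title")
    if not checker.has_main:
        findings.append("missing-main-landmark")
    if checker.images_missing_alt:
        findings.append(f"missing-alt x{checker.images_missing_alt}")
    return findings
```

Checks like these are purely structural, which is why judgments such as "is this alt text adequate?" are left to the AI semantic pass.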

Test backstop

Three layers of regression protection. Every code change runs the unit suite locally; the fixture verifier catches scanner-output drift; the round-trip drivers exercise the live AI providers and the OCR pipeline.
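As an illustration of what catching scanner-output drift could look like, the sketch below compares the rule IDs stored for each fixture against a fresh scan and reports anything missing or unexpected. The function names, the fixture layout, and the rule IDs are assumptions for the example, not the verifier's actual interface.

```python
def diff_findings(expected: list[str], actual: list[str]) -> dict[str, list[str]]:
    """Compare stored rule IDs for a fixture against a fresh scan."""
    expected_set, actual_set = set(expected), set(actual)
    return {
        "missing": sorted(expected_set - actual_set),     # no longer reported
        "unexpected": sorted(actual_set - expected_set),  # newly reported
    }


def verify_fixtures(fixtures: dict[str, list[str]], scan) -> dict[str, dict[str, list[str]]]:
    """Return only the fixtures whose scan output drifted from expectation."""
    drifted = {}
    for name, expected in fixtures.items():
        drift = diff_findings(expected, scan(name))
        if drift["missing"] or drift["unexpected"]:
            drifted[name] = drift
    return drifted
```

A clean run returns an empty dict. When a new heuristic lands, the affected fixtures show up under `unexpected`, and the stored expectations can then be reviewed and updated deliberately.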

Recent delta highlights

Scenario	Before	After	Note
HTML "bad" tier, single AI revision cycle	9 / 100	97 / 100	+88 points across 11 issues
Scanned PDF fixture (image-only)	62 / 100	85 / 100	OCR recovered ~539 chars; image-only finding cleared
Reportlab untitled PDF (metadata only)	66 / 100	82 / 100	Title + /Lang fixes applied
HTML "medium" tier programmatic findings	1 issue	3 issues	New empty-alt and color-only heuristics surfaced gaps

What's shipped, what's next

Capability	Status	Detail
HTML / PPTX / DOCX scanning	Shipped	4 to 10 programmatic checks per format (listed under "What gets detected")
PDF scanning	Shipped	Title · /Lang · tagging · bookmarks · image-only detection
AI semantic scan	Shipped	Tuned for six named recall gaps
Auto-remediation (alt text, lang, titles)	Shipped	Per-format remediators with user control over which fixes apply
OCR for image-only PDFs	Shipped	PyMuPDF render @ 300 DPI + Tesseract; ~5s/page; produces a searchable text layer
Preview workflow (per-fix checkboxes + cost estimates)	Shipped	Gates on >30s, >$0.05, or approximate output; per-user toggle
"Previously declined" tracking	Shipped	Re-scans of the same file content remember what the user chose to skip
Analytics logging (per-user, per-course, per-org)	Shipped	FERPA-safe event row per remediation; `GET /accessibility/analytics/{my,course/{id},org}`
Org accessibility-manager dashboard	Shipped	KPI cards, fix-type bar chart, recent events list. Gated by ORG_ADMIN role on the catalog and the route.
PDF structure-tree tagging (best-effort)	Shipped	`/MarkInfo` + `/StructTreeRoot` placeholder via pikepdf. Clears the scanner's "PDF is not tagged" finding. Not a substitute for properly authored tagging; marked "approximate" so users see it in the preview gate.
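The "Previously declined" row says re-scans of the same file content remember what the user skipped. One plausible way to do this is to key the record on a hash of the file bytes rather than on the filename. The sketch below assumes SHA-256 and an in-memory store; the `DeclinedFixes` class and its method names are hypothetical.

```python
import hashlib


class DeclinedFixes:
    """Remembers skipped fixes, keyed by user and file content (not filename)."""

    def __init__(self) -> None:
        # In-memory stand-in for whatever persistent store is actually used.
        self._declined: dict[tuple[str, str], set[str]] = {}

    @staticmethod
    def content_key(data: bytes) -> str:
        # Assumption: identical bytes count as "the same file content".
        return hashlib.sha256(data).hexdigest()

    def decline(self, user_id: str, data: bytes, fix_id: str) -> None:
        key = (user_id, self.content_key(data))
        self._declined.setdefault(key, set()).add(fix_id)

    def previously_declined(self, user_id: str, data: bytes) -> set[str]:
        key = (user_id, self.content_key(data))
        return set(self._declined.get(key, set()))  # copy, so callers cannot mutate
```

With this design, renaming a file does not reset the user's choices, while any edit to the file's bytes produces a new key and a fresh set of suggestions.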