ADM Transparency Readiness Diagnostic

Discovery Instrument — the questions, and how each is answered

The facilitated session we run to walk one citizen-facing automated decision through its seven surfaces (S1–S7). Each question is tagged by how its answer is established — confirmed with the client in the room, or verified in a technology audit — so the resulting readiness picture is evidence-backed, not self-attested.

StepInsight × Protiviti · 7 surfaces · 29 questions · Confirm + technology audit · Draft (shows the shape of the instrument)

What this session does

We walk one already-chosen decision end to end. At each surface we ask how it actually works today, reflect the answer back, and tie it to the obligations that bite there. The output is a single picture: a heatmap of where the decision is compliant, exposed, or unknown — with the three highest-priority gaps and a straight answer on readiness for the 10 December 2026 ADM transparency deadline.

Because the obligation mesh is the rubric, every stage is scored against the law itself — not opinion. That is what makes the result defensible, and it is a live preview of the continuous-assurance "eval loop" without having to build it.

Before this session — the decision is already chosen

This instrument assumes a decision has been selected and access granted

Choosing which decision, confirming the entity type (non-corporate Commonwealth / corporate / state), and securing access happen in a separate partner-scoping step with Rich beforehand. Those scoping questions are not run here — they gate whether we run this. This instrument is the diagnostic itself.

Who needs to be in the room

Role	Why we need them	Surfaces
Decision / process owner	Holds the end-to-end picture	S1–S7
Data / data-matching owner	The Robodebt zone lives here	S1, S2
Model / scoring owner	Explainability + fairness answers	S3, S7
Legally-authorised decision-maker	The accountability anchor	S4, S5
Caseworker / operational lead	What actually happens vs. the policy	S4, S5, S6

Setting up for a clean, honest session

Pre-session checklist (30 seconds, the facilitator's job)

Attendee full names confirmed (not email handles), with roles and agency-name spelling
The chosen decision named in one line; its known source systems listed (to reflect back cleanly)
The model / scoring type pre-loaded — even rules-based automation can be caught by APP 1.7–1.9
Obligation acronyms ready to expand once (APP, ART, ADM, OAIC) so they anchor in the transcript
Audio sorted: remote attendees on their own devices; no boardroom single-mic
Mindset: reflect back the hard words; name people as they speak; this is mapping, not auditing — rapport first

The opening move (warm, not a script)

"Before we dive in — so I don't misquote anyone later, a quick go-round: your name and what you do here. We're going to walk one decision end to end. At each step I'll ask how it actually works today, so we can lay it against the obligations that bite by the 10th of December. We're not auditing anyone today — we're mapping the surface so the gaps are obvious and fixable. Anything that's 'we're not sure' is exactly what we want to hear."

The walk — seven surfaces, the questions to ask

Confirm: established with the client in the room (governance, intent, ownership, awareness, plans).
Audit: cannot be taken on trust; verified in a follow-on technology audit.

Risk & compliance note — how answers become evidence

Most technical-control questions below cannot be scored "compliant" on the client's say-so. Self-attestation is precisely the failure mode the Robodebt Royal Commission identified — a control "treated as correct by default." So we split the work: in the room we confirm governance, intent and ownership directly with the people accountable; everything tagged Audit is flagged for verification in a follow-on technology audit — inspecting the system, logs, model artefacts, test evidence and the actual citizen-facing notices. A stage only scores green once the audit evidence backs the answer; until then it is provisional. The tags below show which is which — and they double as the scope for that technology audit.
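The scoring rule in the note above can be sketched in a few lines of Python. This is an illustration only: the `Answer` record, the status names other than "green" and "provisional", and the order in which the checks are applied are all assumptions, not part of the instrument.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Answer:
    question: str
    needs_audit: bool             # the question carries the Audit tag
    favourable: Optional[bool]    # None means "we're not sure"
    audit_verified: bool = False  # True only once audit evidence backs the answer


def stage_status(answers):
    """Score one surface from its answers (hypothetical precedence order)."""
    if not answers:
        return "unknown"
    if any(a.favourable is False for a in answers):
        return "exposed"
    if any(a.favourable is None for a in answers):
        return "unknown"
    # A favourable answer on an Audit question stays provisional until verified.
    if any(a.needs_audit and not a.audit_verified for a in answers):
        return "provisional"
    return "green"
```

Under these assumptions, a surface whose answers all sound right in the room still returns "provisional" until every Audit-tagged answer has `audit_verified` set.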

S1 · Intake & data capture: Where the citizen data comes from

Obligations: G6 · G7 · G8

Questions to ask the client

Where does the citizen data feeding this decision come from: how many systems and third-party sources? Confirm + Audit
Are the lawful basis and consent recorded at field level, or assumed? Audit
Is the data trimmed to only what the decision actually needs? Audit
What security baseline applies — PSPF, Essential Eight, certified hosting — and can the current assessment evidence be produced? Audit

Reflect back

Each source system by name; the security-baseline acronym expanded once; any count of systems repeated.

Audit verifies

Field-level provenance in the data schema · fields ingested vs. used · current PSPF/Essential Eight/IRAP assessment evidence.
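The "fields ingested vs. used" check reduces to a set comparison once both field lists can be exported. A minimal Python sketch, assuming the audit has those two lists; the function name and the field names in the usage note are hypothetical and not drawn from any client system.

```python
def field_usage_gaps(ingested, used):
    """Compare the fields a decision ingests with the fields it actually uses."""
    ingested, used = set(ingested), set(used)
    return {
        # Candidates for trimming: collected but never read by the decision.
        "ingested_not_used": sorted(ingested - used),
        # Provenance gap: the decision reads a field the intake does not list.
        "used_not_ingested": sorted(used - ingested),
    }
```

For example, `field_usage_gaps(["name", "postcode", "income"], ["income"])` would list `name` and `postcode` as ingested but not used, which is where the data-minimisation question bites.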

S2 · Data matching (the Robodebt zone): Where averaging and proxy bias hide

Obligations: G4 · G5 · G9

⚑ Heavy-load surface — protect this if time runs short

Questions to ask the client

Is data matched, averaged or integrated across systems to build the case picture (anything resembling income averaging)? Confirm + Audit
Is the matching logic documented and tested, or treated as correct by default? Audit
Are low-confidence matches flagged for a human, or do they flow straight through? Audit
Have the matching assumptions been screened for indirect discrimination through proxy variables? Audit

Reflect back

The phrase "income averaging" if it surfaces — it's the Robodebt failure point. Reflect any threshold/confidence number; name the proxy variables mentioned (postcode, age band).

Audit verifies

The actual matching logic + test evidence · the human-referral threshold in the system · any bias / proxy-variable screening artefacts.
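Why "income averaging" deserves to be reflected back is easiest to see with arithmetic. The Python sketch below spreads an annual total evenly across 26 fortnights and compares it with an uneven earning pattern. All figures are invented for illustration, and the function name is hypothetical.

```python
def averaged_fortnightly(annual_income, fortnights=26):
    """Spread an annual figure evenly across the year's fortnights."""
    return annual_income / fortnights


# Hypothetical person: worked 13 fortnights at 2,000, then earned nothing.
actual = [2000.0] * 13 + [0.0] * 13
annual_total = sum(actual)

averaged = averaged_fortnightly(annual_total)

# Fortnights where the averaged figure is higher than what was really earned.
overstated_fortnights = sum(1 for amount in actual if amount < averaged)
```

Here the averaged figure is 1,000 per fortnight, so income is overstated in each of the 13 fortnights where the person actually earned nothing. A matching step that behaves like this is what the audit needs to find in the logic itself, not in its description.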

S3 · AI scoring / triage: Whether a flag can be explained

Obligations: G2 · G9 · G10

Questions to ask the client

Can you reconstruct, in plain language, why a given person was flagged or scored? Audit
Does each output carry a reason code a caseworker — and a tribunal — could read? Audit
Is the model a black box, or is explainability built in? Audit
Has fairness been tested across the affected population against documented thresholds? Audit

Reflect back

The model / tool name; whether outputs carry a reason code; the fairness-testing cadence if named.

Audit verifies

Explainability tested on real cases · reason-code content in actual outputs · model architecture · documented fairness thresholds + results.
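One way the audit might compare fairness results against a documented threshold is to look at flag rates by group. The Python sketch below is an assumption about method, not something the instrument prescribes: the choice of metric (ratio of lowest to highest flag rate), the group labels and the function names are all hypothetical.

```python
from collections import defaultdict


def flag_rates(cases):
    """cases: iterable of (group_label, was_flagged) pairs."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for group, was_flagged in cases:
        totals[group] += 1
        flagged[group] += int(was_flagged)
    return {group: flagged[group] / totals[group] for group in totals}


def rate_ratio(rates):
    """Lowest flag rate divided by highest; 1.0 means identical rates."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was flagged in any group
    return min(rates.values()) / highest
```

The resulting ratio would then be compared with whatever threshold the client has documented; if no threshold is documented, that absence is itself the finding.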

S4 · The human decision (the Seam): Decide, or rubber-stamp?

Obligations: G2 · G4 · G10

⚑ Heavy-load surface — the accountability anchor

Questions to ask the client

In practice, does the human meaningfully decide, or effectively rubber-stamp the AI output? Confirm + Audit
Are there defined review gates before a decision is finalised? Confirm
When a human overrides the AI, is the override and its reason logged? Audit
Under what documented legal authority and delegation is the final decision made? Confirm
Is that authorised decision-maker the same person who actually reviews the AI output day to day? Confirm

Reflect back

The named decision-maker's role; the distinction between deciding and rubber-stamping ("so the caseworker can override, but most go through as scored — got it"); whether overrides are logged.

Audit verifies

The override / acceptance rate and time-on-task in the logs — the hard evidence of whether oversight is meaningful or nominal — and that override reasons are actually captured.
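The override rate and time-on-task named above could be pulled from a review log along these lines. This is a Python sketch under stated assumptions: the log is a list of dictionaries, and the keys `overridden`, `override_reason` and `seconds_on_task` are hypothetical names, since real case-management logs will differ.

```python
from statistics import median


def oversight_summary(review_log):
    """Summarise how often reviewers override and how long they spend."""
    if not review_log:
        raise ValueError("review_log is empty; nothing to summarise")
    overrides = [r for r in review_log if r["overridden"]]
    return {
        "override_rate": len(overrides) / len(review_log),
        "median_seconds_on_task": median(r["seconds_on_task"] for r in review_log),
        # Overrides with no captured reason undermine the accountability trail.
        "overrides_missing_reason": sum(
            1 for r in overrides if not r.get("override_reason")
        ),
    }
```

A very low override rate combined with a very short median time-on-task does not prove rubber-stamping, but it is the pattern the audit would want explained.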

S5 · Notification, reasons & review: The 10 December 2026 deadline

Obligations: G1 · G4 · G10

⚑ Heavy-load surface — the deadline lands here

Questions to ask the client

Today, is AI involvement disclosed to the citizen — in the privacy policy and at the point of decision? Audit
Are the reasons given adequate to support a merits review at the ART? Confirm + Audit
Is the route to internal review and the ART clear to the citizen? Audit
Does the client know whether this decision is caught by APP 1.7–1.9, and is there a plan to be compliant by 10 December 2026? Confirm

Reflect back

APP 1.7–1.9 expanded once ("the new ADM transparency rule"); ART expanded once ("the Administrative Review Tribunal"); the 10 Dec 2026 date repeated; disclosure live-today vs. planned.

Audit verifies

The actual privacy collection notice + decision letters (the documents, not the description) · whether reasons in a real sample would survive ART scrutiny · review-route wording.

S6 · Logging & provenance: Can you reconstruct a past decision?

Obligations: G4 · G7 · G8

Questions to ask the client

Could you reproduce exactly what data and model version produced a specific decision made months ago? Audit
Is the model version pinned and logged against each decision? Audit
Are the complete decision records kept in an Archives-compliant way, or scattered across operational logs? Audit

Reflect back

The phrase "model version pinned"; where records actually live ("so it's in the case-management system and the model logs — two places").

Audit verifies

An actual reconstruction attempt on one past decision · model-version pinning in the logs · whether the full record set meets Archives Act retention.
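What "model version pinned and logged against each decision" could look like in practice is sketched below in Python. The record layout, field names and the use of a SHA-256 digest over the inputs are assumptions made for illustration; they do not describe any particular client system or an Archives Act requirement.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionRecord:
    decision_id: str
    model_version: str  # pinned at decision time
    input_digest: str   # fingerprint of the exact inputs used
    outcome: str


def digest_inputs(inputs):
    """Stable SHA-256 fingerprint of a JSON-serialisable input dictionary."""
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def can_reconstruct(record, archived_inputs, available_model_versions):
    """True if the pinned model still exists and the archived inputs match."""
    return (
        record.model_version in available_model_versions
        and digest_inputs(archived_inputs) == record.input_digest
    )
```

The reconstruction attempt fails under this sketch if either half is missing: the model version has been retired without being retained, or the archived inputs no longer match what the decision actually saw.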

S7 · Monitoring & assurance: Protiviti's lane, with an AI lens

Obligations: G2 · G3 · G9 · G10 · G11

Questions to ask the client

How is the model currently monitored: continuously, or only when someone remembers? Confirm + Audit
Are drift and fairness re-tested between audits, or only point-in-time? Audit
Does a rule or policy change — like the Dec-2026 date — trigger a re-assessment today? Confirm
Is the assurance sample-based and manual, or moving toward continuous and evidence-rich? Confirm + Audit
If the model or platform is vendor-supplied, do the contracts carry AI accountability terms (e.g. DTA AI Model Clauses) and audit / access rights? Confirm

Reflect back

The monitoring cadence ("reviewed annually unless something breaks — got it"); the point-in-time vs. continuous distinction — where Protiviti's value and the continuous-assurance product live (APRA, April 2026: point-in-time AI assurance is "no longer fit for purpose").

Audit verifies

The monitoring configuration + drift/fairness re-test artefacts · sample-vs-continuous assurance evidence · the supplier contract terms (Q5 — a document review).
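As one example of what a drift re-test artefact might contain, the Python sketch below computes a population stability index (PSI) between a baseline score distribution and a current one. PSI is one common drift measure and is swapped in here as an assumption; the instrument does not name a metric, and the bucket proportions shown in the usage note are invented.

```python
import math


def psi(expected, actual, eps=1e-6):
    """Population stability index over matching bucket proportions."""
    if len(expected) != len(actual):
        raise ValueError("expected and actual must have the same number of buckets")
    total = 0.0
    for e, a in zip(expected, actual):
        # Floor tiny proportions so the logarithm stays defined.
        e = max(e, eps)
        a = max(a, eps)
        total += (a - e) * math.log(a / e)
    return total
```

Identical distributions give a PSI of zero, for example `psi([0.5, 0.5], [0.5, 0.5])`, and the value grows as the current distribution moves away from the baseline. What counts as too much drift is a threshold the client would need to have documented.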

Closing the session — soft check

2–3 minute recap. "Here's what I heard: the decision is [X]; the heaviest exposure looks like S2 / S4 / S5; the systems are [A, B, C]; the thing that surprised me was [Y]." (Reflecting the picture back is itself a clean second capture.)
The check. "Any red flags from that? Anything that doesn't sound like how it actually works for you? Anything important we didn't get to?"
Sit with the silence. The first five seconds of quiet is where the real corrections come out.
Name the audit scope. "A handful of these we can only confirm by looking at the system itself — the logs, the model, the actual notices. I'll send a short list of what we'd need access to." (That list = every Audit question above.)
What happens next. "From this we build the readiness heatmap — every stage scored against the obligations, the three priority gaps named, and a straight answer on readiness by 10 December. That's what goes in front of Lauren and Rita."

Coverage: the 29 questions map to the brief's diagnostic sections C–I, the accountability probe at S4 (brief B-Q9), and a procurement / supplier-accountability question added at S7 (obligation G11 — DTA AI Model Clauses), which the brief's mesh lists but the first draft omitted. Use-case selection and access (brief A, B, J) sit in the separate partner-scoping step and are not run here.

Evidence model: Confirm = established with the client in the room. Audit = verified in a follow-on technology audit before the stage can score compliant. Companion artefact: the readiness heatmap, PVT_adm_readiness-heatmap-mock.html.

Status flags current as at June 2026. Orientation only, not legal advice — confirm currency before any external use. Draft artefact showing the shape of the instrument; not a deployed engagement.