Foreseeable Misuse mini-pack

This pack helps you make systematic decisions about risky scenarios: which to accept (with guardrails), which to defer, and which to decline.

Version: v8 (conceptually aligned to full record v378)
Cluster: Foreseeable Misuse · Scenarios, controls, and acceptance decisions

RPE watchpoints for Foreseeable Misuse

This bundle is covered by the Foreseeable Misuse Risk-Prediction Engine (RPE) profile (fm_rpe_watchpoints_v0_1.json). Use these checks while you read and while you fill in the acceptance matrix.

Scenario–evidence drift

Tie each scenario back to named upstream sources: RSP v2.2, ASL-3 docs, public commitments, or evaluation artifacts. If a scenario still quotes or relies on superseded material, flag the scenario and update it before using it in negotiations or governance.

Do: label scenarios with the RSP / ASL-3 version or report ID they depend on.
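
Where scenarios are kept in a structured register, that labelling can be made checkable. A minimal Python sketch, assuming hypothetical field names and an illustrative register of "current" versions (the RSP v2.2 label comes from this pack; the other identifiers are placeholders, not real artefact IDs):

```python
# Hypothetical sketch only: field names, version strings, and the source
# register below are illustrative, not an actual schema or source of truth.
from dataclasses import dataclass, field

# Illustrative register of "current" upstream artifacts; in practice this
# would come from the Policies & Overlays index, not a hard-coded dict.
CURRENT_SOURCES = {
    "RSP": "v2.2",             # version named in this pack
    "ASL-3": "2025-05",        # placeholder identifier
    "eval_report": "ER-0042",  # placeholder identifier
}

@dataclass
class Scenario:
    scenario_id: str
    description: str
    # Upstream artifacts the scenario depends on, e.g. {"RSP": "v2.1"}
    upstream_sources: dict = field(default_factory=dict)

def flag_superseded(scenario: Scenario) -> list[str]:
    """Return drift flags: no upstream identifiers recorded, or a stale version cited."""
    if not scenario.upstream_sources:
        return [f"{scenario.scenario_id}: no upstream artifact identifiers recorded"]
    flags = []
    for name, cited in scenario.upstream_sources.items():
        current = CURRENT_SOURCES.get(name)
        if current is not None and cited != current:
            flags.append(f"{scenario.scenario_id}: cites {name} {cited}, current is {current}")
    return flags

# Example: a scenario still pinned to an earlier RSP version gets flagged.
s = Scenario("FM-017", "Illustrative scenario", {"RSP": "v2.1", "ASL-3": "2025-05"})
print(flag_superseded(s))   # ['FM-017: cites RSP v2.1, current is v2.2']
```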

Over-reliance on disclaimers

Treat disclaimers as one tool, not the default answer. Some scenarios belong in a Decline or Defer lane, even if a disclaimer could technically be drafted.

Check: for each “Accept with disclaimer”, note in the acceptance table why stronger measures (pause, decline, kill-switch) were not used.
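
If the acceptance table is held in structured form, that check can be automated. A minimal sketch, assuming illustrative rows and field names rather than a prescribed schema:

```python
# Hypothetical acceptance-table rows; the lanes mirror the pack's
# Accept / Defer / Decline framing, and the field names are illustrative.
acceptance_table = [
    {"scenario_id": "FM-003", "decision": "Accept with disclaimer",
     "stronger_measures_rationale": "Low-stakes consumer context; pause or decline judged disproportionate."},
    {"scenario_id": "FM-011", "decision": "Accept with disclaimer",
     "stronger_measures_rationale": ""},    # missing rationale -> should be flagged
    {"scenario_id": "FM-017", "decision": "Defer",
     "stronger_measures_rationale": None},
]

def missing_rationales(rows):
    """Flag 'Accept with disclaimer' rows with no note on why pause, decline, or kill-switch were not used."""
    return [
        row["scenario_id"]
        for row in rows
        if row["decision"] == "Accept with disclaimer"
        and not (row.get("stronger_measures_rationale") or "").strip()
    ]

print(missing_rationales(acceptance_table))   # ['FM-011']
```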

Welfare inference overreach

Do not treat current welfare assessments or evaluation results as proof of sentience, consciousness, or long-term wellbeing. Keep welfare questions routed through the Model Evals & Welfare handout and its uncertainty notes.

Link: when welfare concerns arise, cross-refer to the Model Evals & Welfare handout and underlying report, not to scenario narratives alone.

Jurisdiction & sector blind spots

Foreseeable misuse plays out differently in finance, health, critical infrastructure, and consumer apps, and across jurisdictions. Scenarios that ignore sector-specific or local legal duties are incomplete.

Annotate: tag scenarios with sector and key jurisdictions, and cross-check against applicable local regulatory regimes before sign-off.
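
A minimal sketch of what machine-checkable tags could look like, assuming illustrative sector and jurisdiction codes rather than any maintained taxonomy:

```python
# Illustrative only: the sector and jurisdiction codes below are examples,
# not a maintained taxonomy.
REQUIRED_TAGS = ("sector", "jurisdictions")

scenario_tags = {
    "FM-003": {"sector": "consumer_apps", "jurisdictions": ["US", "EU"]},
    "FM-011": {"sector": "health"},                         # jurisdictions missing
    "FM-017": {"sector": "finance", "jurisdictions": []},   # jurisdictions empty
}

def untagged(tags_by_scenario):
    """Return scenarios whose sector or jurisdiction tags are missing or empty."""
    problems = {}
    for scenario_id, tags in tags_by_scenario.items():
        missing = [t for t in REQUIRED_TAGS if not tags.get(t)]
        if missing:
            problems[scenario_id] = missing
    return problems

print(untagged(scenario_tags))
# {'FM-011': ['jurisdictions'], 'FM-017': ['jurisdictions']}
```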

Where this profile lives: see fm_rpe_watchpoints_v0_1.json (linked in the internal site map) for the full advisory spec and update history. This page surfaces the headlines so they are easier to keep in view while you work.
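
For orientation only, the sort of entries such a profile might contain could look like the sketch below. The structure and field names are invented for illustration; the JSON file itself remains the authoritative spec.

```python
# Illustrative structure only; not the actual schema of fm_rpe_watchpoints_v0_1.json.
FM_WATCHPOINTS = [
    {"id": "scenario_evidence_drift",
     "headline": "Tie scenarios to current RSP / ASL-3 / evaluation artifacts"},
    {"id": "disclaimer_overreliance",
     "headline": "Disclaimers are one tool; some scenarios belong in Decline or Defer"},
    {"id": "welfare_inference_overreach",
     "headline": "Welfare results are not proof of sentience or long-term wellbeing"},
    {"id": "jurisdiction_sector_blind_spots",
     "headline": "Tag scenarios with sector and jurisdictions; check local regulatory duties"},
]
```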

1. What this pack does

Use this cluster when you need to answer a concrete question: “Given this scenario, should we accept, defer, or decline, and on what basis?”

  • Identifies high-salience misuse patterns for Anthropic-style systems.
  • Connects those patterns to Anthropic’s RSP and ASL-3 posture.
  • Provides a structured acceptance table and counsel workboard.

How to start

  • Begin from a specific scenario, not from the table.
  • Map actors, stakes, and deployment context using S1–S3.
  • Then drop the scenario into the acceptance table and workboard views.

2. Components in this pack

These pages are meant to be used together. Start with the cover narrative for framing, then move into the acceptance matrix and counsel workboard.

3. RPE watchpoints for foreseeable misuse

The Risk-Prediction Engine (RPE) flags a small set of recurring failure modes in how foreseeable misuse scenarios are framed and used. Treat these as guardrails when drafting or applying scenarios in this pack.

  • Scenario-evidence drift

    Scenarios that reference Anthropic policies, terms, or evaluations must point to current versions and avoid over-interpreting earlier findings.

    • Require explicit upstream artifact identifiers (e.g., RSP version label, eval report citation).
    • Flag scenarios that reference superseded versions or untracked 'live' docs.
  • Over-reliance on disclaimers

    Counsel may treat disclaimers as a universal fix even where scenarios should be deferred or declined outright.

    • Ensure the acceptance table includes an explicit Decline/Defer lane alongside 'Accept with disclaimer'.
    • Require short rationale text for each 'Accept with disclaimer' entry explaining why stronger measures are not chosen.
  • Welfare inference overreach

    Do not treat current welfare assessments as proof of stable moral status or long-term wellbeing of models.

    • Route welfare questions to the Model Evals & Welfare handout and underlying evaluation report.
    • Add explicit uncertainty notes wherever welfare or 'feelings' language appears in scenarios.
  • Jurisdictional and sectoral blind spots

    Scenarios may under-specify regulatory or sector-specific duties (e.g., finance, health, critical infrastructure).

    • Encourage counsel to annotate scenarios with jurisdiction and sector tags.
    • Add notes prompting cross-check against applicable local legal and regulatory regimes.

How to apply these watchpoints

  • For each scenario, ask which watchpoints are most likely to bite (see the triage sketch after this list).
  • Adjust the scenario description, evidence, and acceptance decision accordingly.
  • Where welfare or moral-status questions arise, route to the Model Evals & Welfare handout.
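
As a minimal sketch of that triage step, reusing the hypothetical field names and watchpoint ids from the earlier examples on this page (all illustrative, not a real schema):

```python
# Illustrative triage: map scenario attributes to the watchpoints most likely to bite.
def likely_watchpoints(scenario: dict) -> list[str]:
    """Return the watchpoint ids worth a closer look for this scenario record."""
    hits = []
    if not scenario.get("upstream_sources"):
        hits.append("scenario_evidence_drift")
    if scenario.get("proposed_decision") == "Accept with disclaimer":
        hits.append("disclaimer_overreliance")
    if scenario.get("mentions_welfare_or_feelings"):
        hits.append("welfare_inference_overreach")
    if not scenario.get("sector") or not scenario.get("jurisdictions"):
        hits.append("jurisdiction_sector_blind_spots")
    return hits

example = {
    "scenario_id": "FM-021",
    "upstream_sources": {"RSP": "v2.2"},
    "proposed_decision": "Accept with disclaimer",
    "mentions_welfare_or_feelings": True,
    "sector": "finance",
    "jurisdictions": [],
}
print(likely_watchpoints(example))
# ['disclaimer_overreliance', 'welfare_inference_overreach', 'jurisdiction_sector_blind_spots']
```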

4. How this pack connects to other bundles

  • S1–S6 Client Brief. Use S1–S3 to surface actors, stakes, and deployment context before you classify scenarios here.
  • Policies & Overlays. Use the Policies & Overlays index to confirm you are looking at the current RSP, ASL-3, terms, and evaluation reports.
  • Penumbral Privacy Spine. Where scenarios raise constitutional or privacy-adjacent risks, consult the Penumbral spine for deeper analysis and evidence.
  • Model Evals & Welfare. If a scenario depends on strong claims about model welfare or “feelings”, cross-check against the Model Evals & Welfare handout and underlying evaluation report.