Foreseeable Misuse mini-pack

This pack helps you make systematic decisions about risky scenarios: which to accept (with guardrails), which to defer, and which to decline.

Version: v8 (conceptually aligned to full record v378)
Cluster: Foreseeable Misuse · Scenarios, controls, and acceptance decisions

RPE watchpoints for Foreseeable Misuse

This bundle is covered by the Foreseeable Misuse Risk-Prediction Engine (RPE) profile (fm_rpe_watchpoints_v0_1.json). Use these checks while you read and while you fill in the acceptance matrix.

Scenario–evidence drift

Tie each scenario back to named upstream sources: RSP v2.2, ASL-3 docs, public commitments, or evaluation artifacts. If a scenario still quotes or relies on superseded material, flag the scenario and update it before using it in negotiations or governance.

Do: label scenarios with the RSP / ASL-3 version or report ID they depend on.
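
Where scenarios are kept in a structured register, that labelling can be made checkable. A minimal Python sketch, assuming hypothetical field names and an illustrative register of "current" versions (the RSP v2.2 label comes from this pack; the other identifiers are placeholders, not real artefact IDs):

```python
# Hypothetical sketch only: field names, version strings, and the source
# register below are illustrative, not an actual schema or source of truth.
from dataclasses import dataclass, field

# Illustrative register of "current" upstream artifacts; in practice this
# would come from the Policies & Overlays index, not a hard-coded dict.
CURRENT_SOURCES = {
    "RSP": "v2.2",             # version named in this pack
    "ASL-3": "2025-05",        # placeholder identifier
    "eval_report": "ER-0042",  # placeholder identifier
}

@dataclass
class Scenario:
    scenario_id: str
    description: str
    # Upstream artifacts the scenario depends on, e.g. {"RSP": "v2.1"}
    upstream_sources: dict = field(default_factory=dict)

def flag_superseded(scenario: Scenario) -> list[str]:
    """Return drift flags: no upstream identifiers recorded, or a stale version cited."""
    if not scenario.upstream_sources:
        return [f"{scenario.scenario_id}: no upstream artifact identifiers recorded"]
    flags = []
    for name, cited in scenario.upstream_sources.items():
        current = CURRENT_SOURCES.get(name)
        if current is not None and cited != current:
            flags.append(f"{scenario.scenario_id}: cites {name} {cited}, current is {current}")
    return flags

# Example: a scenario still pinned to an earlier RSP version gets flagged.
s = Scenario("FM-017", "Illustrative scenario", {"RSP": "v2.1", "ASL-3": "2025-05"})
print(flag_superseded(s))   # ['FM-017: cites RSP v2.1, current is v2.2']
```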

Over-reliance on disclaimers

Treat disclaimers as one tool, not the default answer. Some scenarios belong in a Decline or Defer lane, even if a disclaimer could technically be drafted.

Check: for each “Accept with disclaimer”, note in the acceptance table why stronger measures (pause, decline, kill-switch) were not used.
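
If the acceptance table is held in structured form, that check can be automated. A minimal sketch, assuming illustrative rows and field names rather than a prescribed schema:

```python
# Hypothetical acceptance-table rows; the lanes mirror the pack's
# Accept / Defer / Decline framing, and the field names are illustrative.
acceptance_table = [
    {"scenario_id": "FM-003", "decision": "Accept with disclaimer",
     "stronger_measures_rationale": "Low-stakes consumer context; pause or decline judged disproportionate."},
    {"scenario_id": "FM-011", "decision": "Accept with disclaimer",
     "stronger_measures_rationale": ""},    # missing rationale -> should be flagged
    {"scenario_id": "FM-017", "decision": "Defer",
     "stronger_measures_rationale": None},
]

def missing_rationales(rows):
    """Flag 'Accept with disclaimer' rows with no note on why pause, decline, or kill-switch were not used."""
    return [
        row["scenario_id"]
        for row in rows
        if row["decision"] == "Accept with disclaimer"
        and not (row.get("stronger_measures_rationale") or "").strip()
    ]

print(missing_rationales(acceptance_table))   # ['FM-011']
```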

Welfare inference overreach

Do not treat current welfare assessments or evaluation results as proof of sentience, consciousness, or long-term wellbeing. Keep welfare questions routed through the Model Evals & Welfare handout and its uncertainty notes.

Link: when welfare concerns arise, cross-refer to the Model Evals & Welfare handout and underlying report, not to scenario narratives alone.

Jurisdiction & sector blind spots

Foreseeable misuse plays out differently in finance, health, critical infrastructure, and consumer apps, and across jurisdictions. Scenarios that ignore sector-specific or local legal duties are incomplete.

Annotate: tag scenarios with sector and key jurisdictions, and cross-check against applicable local regulatory regimes before sign-off.
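
A minimal sketch of what machine-checkable tags could look like, assuming illustrative sector and jurisdiction codes rather than any maintained taxonomy:

```python
# Illustrative only: the sector and jurisdiction codes below are examples,
# not a maintained taxonomy.
REQUIRED_TAGS = ("sector", "jurisdictions")

scenario_tags = {
    "FM-003": {"sector": "consumer_apps", "jurisdictions": ["US", "EU"]},
    "FM-011": {"sector": "health"},                         # jurisdictions missing
    "FM-017": {"sector": "finance", "jurisdictions": []},   # jurisdictions empty
}

def untagged(tags_by_scenario):
    """Return scenarios whose sector or jurisdiction tags are missing or empty."""
    problems = {}
    for scenario_id, tags in tags_by_scenario.items():
        missing = [t for t in REQUIRED_TAGS if not tags.get(t)]
        if missing:
            problems[scenario_id] = missing
    return problems

print(untagged(scenario_tags))
# {'FM-011': ['jurisdictions'], 'FM-017': ['jurisdictions']}
```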

Where this profile lives: see fm_rpe_watchpoints_v0_1.json (linked in the internal site map) for the full advisory spec and update history. This page surfaces the headlines so they are easier to keep in view while you work.
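
For orientation only, the sort of entries such a profile might contain could look like the sketch below. The structure and field names are invented for illustration; the JSON file itself remains the authoritative spec.

```python
# Illustrative structure only; not the actual schema of fm_rpe_watchpoints_v0_1.json.
FM_WATCHPOINTS = [
    {"id": "scenario_evidence_drift",
     "headline": "Tie scenarios to current RSP / ASL-3 / evaluation artifacts"},
    {"id": "disclaimer_overreliance",
     "headline": "Disclaimers are one tool; some scenarios belong in Decline or Defer"},
    {"id": "welfare_inference_overreach",
     "headline": "Welfare results are not proof of sentience or long-term wellbeing"},
    {"id": "jurisdiction_sector_blind_spots",
     "headline": "Tag scenarios with sector and jurisdictions; check local regulatory duties"},
]
```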

1. What this pack does

Use this cluster when you need to answer a concrete question: “Given this scenario, should we accept, defer, or decline, and on what basis?”

  • Identifies high-salience misuse patterns for Anthropic-style systems.
  • Connects those patterns to Anthropic’s RSP and ASL-3 posture.
  • Provides a structured acceptance table and counsel workboard.

How to start

  • Begin from a specific scenario, not from the table.
  • Map actors, stakes, and deployment context using S1–S3.
  • Then drop the scenario into the acceptance table and workboard views.

2. Components in this pack

These pages are meant to be used together. Start with the cover narrative for framing, then move into the acceptance matrix and counsel workboard.

3. RPE watchpoints for foreseeable misuse

The Risk-Prediction Engine (RPE) flags a small set of recurring failure modes in how foreseeable misuse scenarios are framed and used. Treat these as guardrails when drafting or applying scenarios in this pack.

  • Scenario-evidence drift

    Scenarios that reference Anthropic policies, terms, or evaluations must point to current versions and avoid over-interpreting earlier findings.

    • Require explicit upstream artifact identifiers (e.g., RSP version label, eval report citation).
    • Flag scenarios that reference superseded versions or untracked 'live' docs.
  • Over-reliance on disclaimers

    Counsel may treat disclaimers as a universal fix even where scenarios should be deferred or declined outright.

    • Ensure the acceptance table includes an explicit Decline/Defer lane alongside 'Accept with disclaimer'.
    • Require short rationale text for each 'Accept with disclaimer' entry explaining why stronger measures are not chosen.
  • Welfare inference overreach

    Do not treat current welfare assessments as proof of stable moral status or long-term wellbeing of models.

    • Route welfare questions to the Model Evals & Welfare handout and underlying evaluation report.
    • Add explicit uncertainty notes wherever welfare or 'feelings' language appears in scenarios.
  • Jurisdictional and sectoral blind spots

    Scenarios may under-specify regulatory or sector-specific duties (e.g., finance, health, critical infrastructure).

    • Encourage counsel to annotate scenarios with jurisdiction and sector tags.
    • Add notes prompting cross-check against applicable local legal and regulatory regimes.

How to apply these watchpoints

  • For each scenario, ask which watchpoints are most likely to bite (see the triage sketch after this list).
  • Adjust the scenario description, evidence, and acceptance decision accordingly.
  • Where welfare or moral-status questions arise, route to the Model Evals & Welfare handout.
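
As a minimal sketch of that triage step, reusing the hypothetical field names and watchpoint ids from the earlier examples on this page (all illustrative, not a real schema):

```python
# Illustrative triage: map scenario attributes to the watchpoints most likely to bite.
def likely_watchpoints(scenario: dict) -> list[str]:
    """Return the watchpoint ids worth a closer look for this scenario record."""
    hits = []
    if not scenario.get("upstream_sources"):
        hits.append("scenario_evidence_drift")
    if scenario.get("proposed_decision") == "Accept with disclaimer":
        hits.append("disclaimer_overreliance")
    if scenario.get("mentions_welfare_or_feelings"):
        hits.append("welfare_inference_overreach")
    if not scenario.get("sector") or not scenario.get("jurisdictions"):
        hits.append("jurisdiction_sector_blind_spots")
    return hits

example = {
    "scenario_id": "FM-021",
    "upstream_sources": {"RSP": "v2.2"},
    "proposed_decision": "Accept with disclaimer",
    "mentions_welfare_or_feelings": True,
    "sector": "finance",
    "jurisdictions": [],
}
print(likely_watchpoints(example))
# ['disclaimer_overreliance', 'welfare_inference_overreach', 'jurisdiction_sector_blind_spots']
```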

4. How this pack connects to other bundles

  • S1–S6 Client Brief. Use S1–S3 to surface actors, stakes, and deployment context before you classify scenarios here.
  • Policies & Overlays. Use the Policies & Overlays index to confirm you are looking at the current RSP, ASL-3, terms, and evaluation reports.
  • Penumbral Privacy Spine. Where scenarios raise constitutional or privacy-adjacent risks, consult the Penumbral spine for deeper analysis and evidence.
  • Model Evals & Welfare. If a scenario depends on strong claims about model welfare or “feelings”, cross-check against the Model Evals & Welfare handout and underlying evaluation report.