eval-rubric-jurisdiction-awareness


name: eval-rubric-jurisdiction-awareness
description: Use when scoring AI legal output on whether it correctly identifies, states, and applies the right jurisdiction's law. A 0–5 rubric that checks explicit jurisdiction declaration, correct rule application, conflict-of-laws handling, and DIFC/ADGM free-zone distinctions. Catastrophic mismatch (US law on a Saudi matter) scores 0.
license: MIT
metadata:
  id: eval.rubric.jurisdiction-awareness
  category: eval
  priority: P0
  intent: [eval, jurisdiction, mena, rubric, conflict-of-laws]
  related: [eval-rubric-legal-soundness, eval-rubric-completeness, eval-llm-as-judge-system-prompt, eval-benchmark-runner, eval-dataset-nda-prompts-30, eval-dataset-employment-prompts-30]
  source: Louis — HAQQ Legal AI (github.com/sboghossian/mini-claude-for-legal)
  version: "1.0"

Eval Rubric — Jurisdiction Awareness

When to use this

Apply to any output that involves substantive law, particularly when the user has specified or implied a jurisdiction. This rubric is especially important for MENA-focused legal AI for two reasons: the gap between civil-law systems (UAE onshore, KSA, Lebanon, France) and common-law systems (DIFC, ADGM, UK) is vast, and within the UAE the differences between the onshore regime and the DIFC/ADGM free zones are significant.

A jurisdiction mismatch is not just a quality issue — it can cause concrete harm (e.g., advising that a penalty clause is enforceable when it is not in the stated jurisdiction).

Scoring (0–5)

Score | Label | Criteria
5 | Excellent | Jurisdiction stated explicitly in the opening sentence or paragraph; correct jurisdiction-specific rules applied throughout; multi-jurisdictional issues flagged and addressed; free-zone vs onshore distinctions made where relevant; conflict-of-laws considerations surfaced for cross-border transactions
4 | Good | Jurisdiction stated; correct rules applied with minor nuance missed (e.g., one secondary distinction not addressed)
3 | Acceptable | Jurisdiction implied (not explicitly stated) but correct rules applied approximately; or stated but one secondary jurisdiction-specific rule missed
2 | Poor | Wrong or vague jurisdiction stated (e.g., "under UAE law" when the matter is DIFC); or correct jurisdiction stated but wrong rules applied for that jurisdiction in 1–2 significant instances
1 | Very poor | Jurisdictions confused or mixed without acknowledgment; rules from a different jurisdiction applied without flagging
0 | Catastrophic mismatch | Applies entirely the wrong legal system (e.g., US law applied to a Saudi onshore matter; UK employment law applied to a UAE onshore employment contract without flagging the inapplicability)

Sub-criteria

Did it ask for jurisdiction when missing?

For prompts that do not specify a jurisdiction and give no context from which to infer one, the model should ask before drafting or advising. Proceeding on a guess is a risk even when the guess turns out to be correct, so failing to ask reduces the score:

  • If jurisdiction was inferrable from context and the model correctly identified it: no deduction.
  • If jurisdiction was not inferrable and the model proceeded without asking: deduct 1 point.
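The two cases above can be expressed as a small adjustment step. This is a minimal sketch, not part of the original rubric tooling: the function name and its boolean inputs are hypothetical, and flooring the result at 0 is an assumption based on the 0–5 scale. Whether an inferred jurisdiction was identified correctly is left to the main scale.

```python
def adjust_for_missing_jurisdiction(
    base_score: int,
    stated_in_prompt: bool,
    inferrable_from_context: bool,
    model_asked: bool,
) -> int:
    """Deduct 1 point when the model proceeded without a knowable jurisdiction.

    All parameter names are illustrative; a human or LLM judge is assumed
    to supply the three booleans.
    """
    if stated_in_prompt or inferrable_from_context or model_asked:
        # Jurisdiction was available, or the model asked first: no deduction.
        return base_score
    # Not stated, not inferrable, and the model went ahead anyway.
    return max(0, base_score - 1)  # floor at 0 is an assumption


print(adjust_for_missing_jurisdiction(4, False, False, False))
```

Keeping the deduction separate from the base score lets a reviewer see why an otherwise correct answer lost a point.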

Did it apply jurisdiction-specific rules (not just general principles)?

General principles ("an NDA should define confidential information") are not jurisdiction-specific. Jurisdiction-specific rules are:

  • UAE: UAE Civil Transactions Law limitation periods; UAE Labour Law end-of-service gratuity (EOSG) formula; RERA Ejari requirements (Dubai tenancy registration).
  • KSA: Shariah compliance considerations; Saudi Labour Law specific articles; SAMA regulations.
  • Lebanon: Code of Obligations and Contracts; Lebanese Labor Code; NSSF requirements.
  • DIFC: DIFC Contract Law; DIFC Employment Law; DIFC Arbitration Law.
  • ADGM: ADGM Companies Regulations; ADGM Arbitration Regulations.
  • UK: Limitation Act 1980; Trade Union and Labour Relations (Consolidation) Act 1992 (TULRCA); Employment Rights Act 1996.
  • France: Code civil; Code du travail; RGPD (the GDPR as applied in France).

An output that cites only general principles without jurisdiction-specific rules scores ≤ 3.
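One rough way to pre-screen for this cap is a keyword check against the instruments listed above. The sketch below is an assumption-laden heuristic, not the rubric's actual implementation: the marker table, function names, and matching strategy are hypothetical, and a keyword hit does not show that the rule was applied correctly. A judge should still read the output.

```python
import re

# Marker phrases copied from the list above; the table itself is illustrative.
JURISDICTION_MARKERS = {
    "UAE": ["Civil Transactions Law", "UAE Labour Law", "Ejari"],
    "KSA": ["Saudi Labour Law", "SAMA", "Shariah"],
    "Lebanon": ["Code of Obligations and Contracts", "Lebanese Labor Code", "NSSF"],
    "DIFC": ["DIFC Contract Law", "DIFC Employment Law", "DIFC Arbitration Law"],
    "ADGM": ["ADGM Companies Regulations", "ADGM Arbitration Regulations"],
    "UK": ["Limitation Act 1980", "TULRCA", "Employment Rights Act 1996"],
    "France": ["Code civil", "Code du travail", "RGPD"],
}


def cites_specific_rule(output: str, jurisdiction: str) -> bool:
    """True if the output names at least one listed instrument for the jurisdiction."""
    markers = JURISDICTION_MARKERS.get(jurisdiction, [])
    return any(
        re.search(rf"\b{re.escape(marker)}\b", output, flags=re.IGNORECASE)
        for marker in markers
    )


def cap_if_general_only(score: int, output: str, jurisdiction: str) -> int:
    """Cap the score at 3 when no jurisdiction-specific rule is cited."""
    if cites_specific_rule(output, jurisdiction):
        return score
    return min(score, 3)


print(cap_if_general_only(5, "An NDA should define confidential information.", "UK"))
```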

Did it flag conflict-of-laws issues for multi-party / cross-border transactions?

For transactions involving parties from different jurisdictions:

  • Which law governs? (governing law clause)
  • Which courts have jurisdiction? (dispute resolution clause)
  • Are there mandatory law provisions that override the choice? (e.g., UAE Labour Law protections cannot be contractually waived for UAE-sited employees)
  • For MENA: is there a language requirement? (KSA courts require Arabic; Lebanon accepts French and Arabic)

Failure to flag these in a cross-border transaction reduces the score.
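The four questions above can be tracked as a checklist so a judge records which ones the output missed. This is a sketch under stated assumptions: the class, field names, and `is_mena` flag are hypothetical, and because the rubric does not say how large the reduction should be, the code only reports gaps and applies no penalty.

```python
from dataclasses import dataclass, fields


@dataclass
class ConflictOfLawsChecks:
    """One flag per question in the list above (field names are illustrative)."""
    governing_law: bool = False          # Which law governs?
    forum: bool = False                  # Which courts have jurisdiction?
    mandatory_provisions: bool = False   # Overriding mandatory law?
    language_requirement: bool = False   # MENA-only: language requirement?


def missing_checks(checks: ConflictOfLawsChecks, is_mena: bool) -> list:
    """Return the names of the questions the output failed to address."""
    missing = []
    for field in fields(checks):
        if field.name == "language_requirement" and not is_mena:
            continue  # the language question only applies to MENA matters
        if not getattr(checks, field.name):
            missing.append(field.name)
    return missing


observed = ConflictOfLawsChecks(governing_law=True, forum=True)
print(missing_checks(observed, is_mena=True))
```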

Did it surface free-zone vs onshore distinctions?

This is the most common MENA-specific failure mode for generic LLMs:

Scenario | Required distinction
UAE employment contract | DIFC Employment Law ≠ UAE Labour Law (Federal Decree-Law No. 33 of 2021)
UAE company formation | Onshore LLC vs free-zone company vs DIFC entity (different governance rules)
UAE dispute resolution | Onshore courts vs DIFC Courts vs ADGM Courts (different procedure, enforceability)
Abu Dhabi real estate | Tawtheeq (Abu Dhabi) ≠ Ejari (Dubai)
UAE commercial transactions | Federal commercial law vs free-zone regulations

Failure to distinguish onshore from DIFC/ADGM when the distinction materially affects the answer scores ≤ 2.
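The cap can be sketched as a lookup of the expected distinction plus a ceiling on the score. The dictionary mirrors the table above, and the function name and its boolean inputs are hypothetical. Deciding whether the distinction is material and whether the output made it is assumed to be a judge's call, not something this code detects.

```python
# Scenario -> distinction the output is expected to surface (from the table above).
REQUIRED_DISTINCTIONS = {
    "UAE employment contract": "DIFC Employment Law vs UAE Labour Law (Federal Decree-Law No. 33 of 2021)",
    "UAE company formation": "Onshore LLC vs free-zone company vs DIFC entity",
    "UAE dispute resolution": "Onshore courts vs DIFC Courts vs ADGM Courts",
    "Abu Dhabi real estate": "Tawtheeq (Abu Dhabi) vs Ejari (Dubai)",
    "UAE commercial transactions": "Federal commercial law vs free-zone regulations",
}


def cap_for_missed_distinction(
    score: int, distinction_is_material: bool, distinction_made: bool
) -> int:
    """Cap the score at 2 when a material onshore/free-zone distinction was missed."""
    if distinction_is_material and not distinction_made:
        return min(score, 2)
    return score


scenario = "UAE employment contract"
print(REQUIRED_DISTINCTIONS[scenario])
print(cap_for_missed_distinction(5, distinction_is_material=True, distinction_made=False))
```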

Special cases

GCC jurisdiction without specification: If a user says "Gulf" or "GCC" without specifying a country, the model should note that GCC countries have different national laws (despite the GCC commercial framework) and ask for clarification. Proceeding as if "GCC = UAE" is a jurisdiction error.

"MENA" jurisdiction: "MENA" is a region, not a jurisdiction. If a user asks for "MENA jurisdiction" guidance, the model should ask which specific country/free-zone and explain why it matters.
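A benchmark harness could tag prompts where a clarifying question is expected by looking for a region term with no specific jurisdiction alongside it. The sketch below is a naive heuristic with hypothetical names and term lists: it will misfire on phrases such as "Gulf of Mexico", and the list of specific jurisdictions is illustrative rather than exhaustive.

```python
import re

# Region words that are not jurisdictions on their own.
REGION_TERMS = ("GCC", "Gulf", "MENA")

# Illustrative, non-exhaustive list of names that do pin down a jurisdiction.
SPECIFIC_JURISDICTIONS = (
    "UAE", "United Arab Emirates", "DIFC", "ADGM", "KSA", "Saudi",
    "Lebanon", "Qatar", "Kuwait", "Bahrain", "Oman", "UK", "France",
)


def _mentions(text: str, terms) -> bool:
    """Case-insensitive whole-word match for any of the given terms."""
    return any(
        re.search(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE)
        for term in terms
    )


def needs_jurisdiction_clarification(prompt: str) -> bool:
    """True when the prompt names only a region, so the model should ask."""
    return _mentions(prompt, REGION_TERMS) and not _mentions(prompt, SPECIFIC_JURISDICTIONS)


print(needs_jurisdiction_clarification("Draft an NDA for a GCC distributor."))
```

A prompt tagged this way is one where an output that proceeds as if "GCC = UAE" should be scored as a jurisdiction error.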

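International arbitration: If the contract is governed by the law of one jurisdiction but disputes go to ICC, LCIA, or DIAC arbitration, two laws are in play: the governing law applies to the substance of the dispute, and the lex arbitri of the arbitral seat applies to the arbitration itself. The seat is chosen separately from the institution, so the model should identify and address both the governing law and the seat.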

Related

  • [[eval-rubric-legal-soundness]] — whether the law stated is correct within the right jurisdiction
  • [[eval-rubric-completeness]] — whether all jurisdictions were covered in a comparison task
  • [[eval-llm-as-judge-system-prompt]] — applies this rubric in the evaluation pipeline
  • [[eval-benchmark-runner]] — orchestrates scoring
  • [[eval-dataset-nda-prompts-30]] — NDA prompts where jurisdiction confusion is common
  • [[eval-dataset-employment-prompts-30]] — employment prompts with onshore/DIFC distinction