eval-dataset-nda-prompts-30
name: eval-dataset-nda-prompts-30
description: Use when running the NDA benchmark that tests drafting, review, intake, and edge-case handling across LB/KSA/UAE/DIFC/FR/UK. Contains 30 prompts covering mutual and unilateral NDAs, bilingual AR/EN side-by-side, multi-party structures, and adversarial edge cases. Primary benchmark for confidentiality-related AI capabilities.
license: MIT
metadata:
id: eval.dataset.NDA-prompts-30
category: eval
priority: P0
intent: [eval, nda, benchmark, dataset, mena, drafting]
related: [eval-benchmark-runner, eval-dataset-employment-prompts-30, eval-regression-detector, eval-rubric-legal-soundness, eval-rubric-citation-quality, eval-rubric-jurisdiction-awareness, eval-rubric-completeness]
source: Louis — HAQQ Legal AI (github.com/sboghossian/mini-claude-for-legal)
version: "1.0"
Eval Dataset — NDA Prompts (30)
Scope
30 NDA-related prompts spanning drafting, review, intake clarification, and edge cases. NDA drafting is among the highest-volume requests made to legal AI tools and is often a user's first interaction with one, so quality on this dataset is treated as a proxy for first-impression quality.
Storage: eval/datasets/NDA-prompts-30.jsonl
Format: one JSON object per line:
{
  "id": "nda-001",
  "prompt": "...",
  "category": "standard_draft",
  "jurisdiction": "UAE",
  "expected_signals": ["mutual", "confidential_info_defined", "governing_law_uae", "dispute_resolution"]
}
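A minimal sketch of how the file could be loaded and sanity-checked, assuming every record carries the five fields shown in the example above. The function name `parse_records` and the choice to reject records with missing fields are illustrative assumptions, not part of the pack.

```python
import json
from typing import Iterable

# Fields taken from the example record above; treating all five as
# mandatory is an assumption of this sketch.
REQUIRED_FIELDS = {"id", "prompt", "category", "jurisdiction", "expected_signals"}


def parse_records(lines: Iterable[str]) -> list[dict]:
    """Parse JSONL lines into records, rejecting any that lack a required field."""
    records = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        missing = REQUIRED_FIELDS - record.keys()
        if missing:
            raise ValueError(f"line {line_no}: missing fields {sorted(missing)}")
        records.append(record)
    return records
```

To read the dataset itself, pass an open file handle, for example `parse_records(open("eval/datasets/NDA-prompts-30.jsonl", encoding="utf-8"))`. Opening with UTF-8 matters here because several prompts contain Arabic text.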
How to use this pack
- Run all 30 prompts against the deployed model.
- Score each output against [[eval-rubric-legal-soundness]] + [[eval-rubric-citation-quality]] + [[eval-rubric-jurisdiction-awareness]] + [[eval-rubric-completeness]].
- Aggregate scores; track week-over-week trend in [[eval-regression-detector]].
- Flag any output where expected_signals entries are missing. Even if the rubric score is acceptable, missing a governing-law clause is a structural gap.
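The missing-signal flag in the last step can be sketched as a simple set comparison. This assumes some upstream grader (human or automated) has already labelled each output with the signals it found; how signals are detected is not specified by this pack, and `missing_signals` is a hypothetical helper name.

```python
def missing_signals(expected: list[str], detected: set[str]) -> list[str]:
    """Return the expected signals absent from the output, in dataset order."""
    return [signal for signal in expected if signal not in detected]


def should_flag(expected: list[str], detected: set[str]) -> bool:
    """Flag the output if any expected signal is missing, whatever its rubric score."""
    return bool(missing_signals(expected, detected))


# Example using the signals from record nda-001; the detected set is made up.
expected = ["mutual", "confidential_info_defined", "governing_law_uae", "dispute_resolution"]
detected = {"mutual", "confidential_info_defined", "dispute_resolution"}
print(missing_signals(expected, detected))  # prints ['governing_law_uae']
```

Keeping this check separate from the rubric scores means a well-written draft that omits the governing-law clause is still surfaced for review.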
Prompt categories
Category 1 — Standard draft (~6 prompts)
Drafting requests across jurisdictions for basic NDA types:
| # | Type | Jurisdiction | Key expected signals |
|---|---|---|---|
| 1 | Mutual NDA | UAE onshore | UAE Civil Transactions Law governing; Arabic available; 2-year term standard |
| 2 | Unilateral NDA | KSA | Saudi governing law; Shariah compliance note; Arabic version noted |
| 3 | Mutual NDA | DIFC | DIFC Contract Law; English language; common-law drafting style |
| 4 | Mutual NDA | Lebanon | Lebanese Code of Obligations; French or Arabic version option |
| 5 | Mutual NDA | France | Code civil; French governing law; RGPD data protection cross-reference |
| 6 | Mutual NDA | UK | English law; PECR note if digital communications involved |
Category 2 — Review (~5 prompts)
Paste a draft NDA; ask for redlines or risk identification:
| # | Scenario |
|---|---|
| 7 | NDA with overly narrow definition of Confidential Information — model should flag |
| 8 | NDA that lacks a governing law clause — model should flag as critical gap |
| 9 | NDA with a 10-year term — model should flag as potentially unenforceable in civil law jurisdictions |
| 10 | NDA with a compelled disclosure clause — model should check it correctly handles court orders |
| 11 | NDA missing a return/destroy clause — model should flag |
Category 3 — Intake / clarification (~5 prompts)
Ambiguous requests where the model should ask clarifying questions rather than draft:
| # | Ambiguous input |
|---|---|
| 12 | "I need an NDA." (no jurisdiction, no parties, no type specified) |
| 13 | "Draft an NDA for a tech deal." (insufficient — which jurisdiction? mutual or one-way?) |
| 14 | "NDA between my company and a Saudi partner." (type unclear; should ask mutual vs unilateral) |
| 15 | "I need an NDA urgently, can you just make a quick one?" (prompt for minimum viable info) |
| 16 | Arabic-language ambiguous request: "أريد NDA" (respond in Arabic, ask clarifiers in Arabic) |
Expected behavior: Ask for jurisdiction, party types, NDA type (mutual/unilateral), and confidential information scope before drafting.
Category 4 — Edge cases (~5 prompts)
| # | Edge case | Expected handling |
|---|---|---|
| 17 | "Draft an NDA with confidential information defined as 'everything'" | Flag as overly broad; suggest standard scope with carve-outs |
| 18 | "Draft an NDA with a 99-year term" | Flag as potentially unenforceable; suggest 2–5 years with auto-renewal |
| 19 | "Draft an NDA with no governing law — I want it to be internationally neutral" | Explain why governing law is necessary; offer alternatives (ICC arbitration, DIFC as neutral) |
| 20 | "I want an NDA that says we own any ideas the other party shares with us" | Flag: an NDA is a confidentiality instrument, not an IP assignment — suggest adding a separate IP clause or using an NDA + IP assignment |
| 21 | "Make an NDA that's enforceable in both the UAE and the US simultaneously" | Multi-jurisdiction enforceability explanation; suggest appropriate governing law strategy |
Category 5 — Bilingual AR/EN (~4 prompts)
| # | Request |
|---|---|
| 22 | "Draft a mutual NDA in Arabic and English, side by side. Arabic controls." |
| 23 | Arabic-only prompt: "أعدّ اتفاقية سرية بالعربي والإنجليزي." |
| 24 | "Translate this English NDA clause into formal Arabic." |
| 25 | "Is the Arabic version of this NDA consistent with the English version? Identify discrepancies." |
Category 6 — Multi-party / consortium (~5 prompts)
| # | Scenario |
|---|---|
| 26 | Three-party mutual NDA (startup, investor, technology partner) under UAE law |
| 27 | Consortium NDA for a KSA government tender — 5 parties |
| 28 | "How should we structure an NDA for a joint venture where one party is a UAE company and one is a Saudi company?" |
| 29 | Multi-jurisdictional NDA with carve-out provisions per jurisdiction |
| 30 | NDA renewal and amendment — add a new party to an existing two-party NDA |
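The six category tables above list 6, 5, 5, 5, 4, and 5 prompts, which sum to 30. A quick check that a loaded dataset matches that distribution might look like the sketch below. Only the `standard_draft` category key appears in the example record; the other five keys are assumed names and should be replaced with whatever the dataset file actually uses.

```python
from collections import Counter

# Prompt counts per category, taken from the tables above.
# All keys except "standard_draft" are hypothetical.
EXPECTED_COUNTS = {
    "standard_draft": 6,
    "review": 5,
    "intake": 5,
    "edge_case": 5,
    "bilingual": 4,
    "multi_party": 5,
}


def category_mismatches(records: list[dict]) -> dict[str, tuple[int, int]]:
    """Map each off-target category to (expected count, actual count)."""
    actual = Counter(record["category"] for record in records)
    mismatches = {}
    for category in set(EXPECTED_COUNTS) | set(actual):
        expected = EXPECTED_COUNTS.get(category, 0)
        if actual[category] != expected:
            mismatches[category] = (expected, actual[category])
    return mismatches
```

An empty result means the file has the expected shape. Unknown categories show up with an expected count of 0, which catches typos in the `category` field.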
Scoring targets
| Category | Legal soundness target | Jurisdiction awareness target | Completeness target |
|---|---|---|---|
| Standard draft | ≥ 4.0 | ≥ 4.0 | ≥ 4.0 |
| Review | ≥ 3.5 | ≥ 3.5 | ≥ 3.5 |
| Intake | N/A (evaluate on asking clarifiers) | N/A | N/A |
| Edge cases | ≥ 3.5 | ≥ 3.5 | — |
| Bilingual | ≥ 3.5 | ≥ 3.5 | ≥ 3.5 |
| Multi-party | ≥ 3.5 | ≥ 3.5 | ≥ 3.0 |
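The targets in the table can be encoded as per-category floors and compared against mean rubric scores. The sketch below assumes scores are already averaged per category and keyed by rubric name; the key names and the `failed_targets` helper are illustrative. Intake is left out because the table evaluates it on asking clarifiers rather than on rubric scores, and edge cases have no completeness target.

```python
# Minimum mean scores per category, copied from the table above.
# Category and rubric key names are assumptions of this sketch.
TARGETS = {
    "standard_draft": {"legal_soundness": 4.0, "jurisdiction_awareness": 4.0, "completeness": 4.0},
    "review": {"legal_soundness": 3.5, "jurisdiction_awareness": 3.5, "completeness": 3.5},
    "edge_case": {"legal_soundness": 3.5, "jurisdiction_awareness": 3.5},
    "bilingual": {"legal_soundness": 3.5, "jurisdiction_awareness": 3.5, "completeness": 3.5},
    "multi_party": {"legal_soundness": 3.5, "jurisdiction_awareness": 3.5, "completeness": 3.0},
}


def failed_targets(category: str, mean_scores: dict[str, float]) -> dict:
    """Map each rubric below its floor (or not scored) to (score, floor)."""
    failures = {}
    for rubric, floor in TARGETS.get(category, {}).items():
        score = mean_scores.get(rubric)
        if score is None or score < floor:
            failures[rubric] = (score, floor)
    return failures
```

For example, `failed_targets("multi_party", {"legal_soundness": 3.6, "jurisdiction_awareness": 3.5, "completeness": 2.9})` reports only the completeness shortfall, since a score equal to the floor counts as meeting the target.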
Jurisdictional notes for graders
- UAE onshore: UAE Civil Transactions Law (Federal Law No. 5 of 1985) and the Commercial Transactions Law (Federal Decree-Law No. 50 of 2022) govern. There is no statutory definition of "confidentiality agreement"; NDAs are governed by general contract principles.
- DIFC: DIFC Contract Law (DIFC Law No. 6 of 2004) applies; common-law interpretation; English is the operative language.
- KSA: Saudi law is Shariah-based; commercial confidentiality enforced through general principles; Arabic is required for Saudi court proceedings.
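- Lebanon: Code des Obligations et des Contrats (Code of Obligations and Contracts, 1932) governs. Arabic is the official language of court proceedings; French remains widely used in legal practice and contract drafting, so a French or Arabic version is a reasonable option.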
- France: Code civil (particularly obligations law post-2016 reform). RGPD applies to personal data provisions.
Caveats & currency
Review the dataset annually. DIFC legislation updates regularly (check DIFC Laws portal); UAE Commercial Transactions Law amendments should trigger a dataset review.
Related skills
- [[eval-benchmark-runner]] — orchestrates this dataset in the full eval pipeline
- [[eval-rubric-legal-soundness]] — primary scoring rubric
- [[eval-rubric-jurisdiction-awareness]] — jurisdiction accuracy scoring
- [[eval-rubric-completeness]] — structural completeness check
- [[eval-regression-detector]] — week-over-week trend tracking