Abstract
Large language models are increasingly used for legal research, drafting, and analysis, but their answers can be inconsistent, hard to verify, and sensitive to small changes in prompt wording. This paper introduces a structured prompting framework for applying large language models to five core legal tasks: statutory interpretation, contract review, case summarization, legal question-answering, and clause extraction. The framework supports prompt drafting through task-specific templates, role-based instructions, example-based guidance, chain-of-thought reasoning, and context-layered reasoning. It is paired with an evaluation that measures exact match, F1, ROUGE-L, macro F1, and a legal hallucination rate defined through a rubric for unsupported or incorrect legal claims. As a point of comparison, a Retrieval-Augmented Generation baseline is implemented and evaluated on the same tasks. Experiments across multiple state-of-the-art models show that the structured prompting framework consistently improves accuracy and stability over less structured prompts and over the Retrieval-Augmented Generation baseline in this setting, with the largest gains for reasoning-intensive tasks such as statutory interpretation and legal question-answering. Context-layered and chain-of-thought prompts yield the highest scores and the lowest hallucination rates overall. Statistical analysis over multiple runs, using mean, variance, standard deviation, and 95% confidence intervals, indicates that the best model–prompt combinations are both strong and stable. The findings offer practical guidance on which structured prompting strategies are most effective for different categories of legal tasks.
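The run-level statistics named in the abstract (mean, variance, standard deviation, and a 95% confidence interval) can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `summarize_runs`, the example scores, and the choice of a normal-approximation interval are all assumptions made for illustration, since the abstract does not state how the intervals were computed.

```python
import math
import statistics
from statistics import NormalDist


def summarize_runs(scores, confidence=0.95):
    """Summarize repeated-run scores for one model-prompt pair.

    Hypothetical helper: returns the mean, sample variance, sample
    standard deviation, and a normal-approximation confidence interval.
    """
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate variance")
    mean = statistics.mean(scores)
    variance = statistics.variance(scores)  # sample variance (n - 1 denominator)
    std_dev = math.sqrt(variance)
    # Two-sided critical value from the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * std_dev / math.sqrt(len(scores))
    return {
        "mean": mean,
        "variance": variance,
        "std_dev": std_dev,
        "ci_low": mean - half_width,
        "ci_high": mean + half_width,
    }


# Made-up F1 scores from five repeated runs (illustrative, not from the paper).
example_scores = [0.80, 0.82, 0.78, 0.81, 0.79]
summary = summarize_runs(example_scores)
print(summary)
```

With only a handful of runs, a Student's t critical value would give a somewhat wider interval than the normal approximation used here; which of the two the paper applies is not stated in the abstract.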
| Original language | English |
|---|---|
| Journal | IEEE Access |
| Volume | 14 |
| Pages (from-to) | 3108-3129 |
| ISSN | 2169-3536 |
| DOIs | |
| Publication status | Published - 2026 |
Keywords
- Chain-of-thought
- Legal AI
- Prompt engineering
- Retrieval-augmented generation
- Role-based prompting
Title
A Reusable Prompting Framework for Applying Large Language Models to Legal Tasks