Best Practices
This guide provides essential best practices for developers integrating with the Otto Schmidt Legal Data Hub APIs. Following these guidelines ensures optimal search quality, performance, and user experience.
Reading time: 3-5 minutes
1. Choose the Right Endpoint for Your Use Case
The Legal Data Hub provides two main endpoints, each optimized for different scenarios:
Semantic Search (/semantic-search)
Best for: Document discovery, building comprehensive research, finding relevant precedents
Use when: You need to retrieve a larger number of relevant documents, build citation lists, explore a topic broadly, or want to build LLM responses yourself.
Chat (/chatbot)
Best for: Getting explanations, understanding procedures, getting questions answered
Use when: You need a synthesized answer with reasoning, not just source documents.
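The endpoint choice above can be sketched as a small helper. The endpoint paths and the `candidates` parameter come from this guide; the payload field name `query` and the concrete default values are illustrative assumptions, not the official schema.

```python
def build_request(query: str, need_synthesized_answer: bool) -> tuple[str, dict]:
    """Return (endpoint, payload) for the LDH call that fits the use case.

    Payload field names are assumptions for illustration; check the API
    reference for the real request schema.
    """
    if need_synthesized_answer:
        # Chat: a synthesized answer with reasoning, not just sources.
        return "/chatbot", {"query": query}
    # Semantic search: raw source documents for your own processing.
    return "/semantic-search", {"query": query, "candidates": 10}
```

For example, `build_request("Welche Fristen gelten bei der ordentlichen Kündigung?", need_synthesized_answer=False)` targets `/semantic-search` because the caller wants documents rather than an answer.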
2. Query Formulation: The Most Critical Success Factor
Query quality directly impacts result relevance. The Legal Data Hub uses multiple NLP (Natural Language Processing) models during search. These models are trained on vast amounts of internet data and therefore tend to work better with natural human language.
✅ Best Practices
Write fully-formulated German sentences:
Keep queries focused on a single legal topic:
Be specific, not vague:
For comparative queries, explicitly ask for differences:
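The guidelines above can be illustrated with before/after query pairs. The German queries below are illustrative examples written for this sketch, not taken from the official documentation; each key is a weak query and each value a reformulation that follows one of the guidelines.

```python
# Illustrative query pairs: key = weak query, value = reformulated query.
QUERY_EXAMPLES = {
    # Vague keyword -> fully formulated German sentence, specific not vague
    "Kündigung": (
        "Welche Voraussetzungen gelten für eine außerordentliche "
        "Kündigung nach § 626 BGB?"
    ),
    # Two legal areas mixed -> single focused topic
    "Mietrecht und Arbeitsrecht Verjährung": (
        "Wann verjähren Ersatzansprüche des Vermieters nach § 548 BGB?"
    ),
    # Implicit comparison -> explicitly asking for differences
    "ordentliche außerordentliche Kündigung": (
        "Was sind die Unterschiede zwischen ordentlicher und "
        "außerordentlicher Kündigung?"
    ),
}
```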
Why This Matters
Language models trained on natural text understand context, relationships, and implied requirements. Well-formulated questions enable the system to retrieve precisely the information you need.
3. Understand and Select the Right Data Assets
Data assets define the scope of your legal research and align with Otto Schmidt Online subscriptions. Before integrating, ensure you have acquired the correct data assets (modules) through your subscription to access the content you need.
Discover Available Assets
Pro Tip: You only need to query the available data assets once. The list changes only when you order new data assets.
Selection Strategies
Wildcard "*" (Broad Research):
Best for:
Exploratory research, cross-domain legal issues, or cases where the relevant domain is unclear. If you dynamically update your data asset subscriptions, the wildcard "*" also means you never need to change your code.
Specific Module (Focused Research):
Best for: Deep expertise in a specific legal area, or domain-focused applications, e.g. when the end user of your app can choose the module themselves.
4. Use Filters for Precision Search
Filters narrow results based on document metadata, enabling targeted legal research.
Common Filter Patterns
Filter by Document Type:
Filter by Court:
Filter by Legal Reference (Phrase Matching):
Filter by Date Range (Recent Judgments):
Date format is ISO 8601 (YYYY-MM-DD). Operators: gte (>=), lte (<=), gt (>), lt (<).
Combined Filter (Court + Date + Document Type):
This finds BAG judgments from 2020 onwards on extraordinary termination grounds.
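A sketch of that combined filter as a request payload. The metadata field names, the `gte` operator, and the ISO 8601 date format come from this guide; the exact nesting of the filter object and the `metadata.datum` date field name are assumptions to be checked against the API reference.

```python
# Combined filter: BAG judgments from 2020 onwards, as described above.
# Field names and the gte operator are from the guide; the JSON shape
# and the date field name "metadata.datum" are illustrative assumptions.
combined_filter = {
    "metadata.gericht.keyword": "BAG",
    "metadata.dokumententyp.keyword": "Urteil",
    "metadata.datum": {"gte": "2020-01-01"},  # ISO 8601, >= operator
}

payload = {
    "query": "Welche Gründe rechtfertigen eine außerordentliche Kündigung?",
    "filters": combined_filter,
}
```

A legal-reference filter would use `metadata.normenkette` with a phrase value such as `"KSchG 1"` in the same way.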
Key Metadata Fields
| Field | Description | Example Values |
| --- | --- | --- |
| metadata.dokumententyp.keyword | Document type | "Kommentar", "Urteil", "Gesetz" |
| metadata.gericht.keyword | Court/organization | "BGH", "BAG", "BFH" |
| metadata.normenkette | Legal reference | "BGB 535", "KSchG 1" |
5. Optimize Results with Post-Reranking
Post-reranking uses advanced semantic models to reorder search results for better relevance.
When to Enable post_reranking = true
✅ Use when:
Complex queries with nuanced intent
Quality is more important than speed
Initial results need better relevance ranking
❌ Avoid when:
Response time is critical
Your search query is very generic
Remember: Reranking effectiveness depends heavily on well-formulated queries (see Section 2).
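The ✅/❌ criteria above can be encoded in a small payload builder. The `post_reranking` parameter name is from this guide; the word-count heuristic for "very generic" queries and the payload shape are illustrative assumptions.

```python
def build_search_payload(query: str, latency_critical: bool = False) -> dict:
    """Build a semantic-search payload, enabling post-reranking only when
    the query looks specific and response time is not the priority.

    The word-count heuristic is a crude stand-in for 'generic query';
    tune it for your application.
    """
    # Very short queries are usually generic; reranking adds little there.
    query_is_specific = len(query.split()) >= 5
    return {
        "query": query,
        "post_reranking": query_is_specific and not latency_critical,
    }
```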
6. Control Result Quantity with candidates
The candidates parameter for the semantic-search endpoint determines how many results are returned.
Recommendations
The following recommendations are made based on internal benchmarks.
| Use Case | Suggested candidates |
| --- | --- |
| Chatbot | 5-8 |
| Comprehensive research | 10-20 |
Balance: More candidates = broader coverage but slower responses and potentially more noise.
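The recommendations can be turned into a lookup with a sensible default. The ranges are the benchmark-based values from the table in this section; picking the midpoint as a default is an assumption of this sketch.

```python
# Suggested `candidates` ranges per use case, from the table above.
CANDIDATE_RANGES = {
    "chatbot": (5, 8),
    "comprehensive_research": (10, 20),
}

def suggest_candidates(use_case: str) -> int:
    """Return a default `candidates` value: the midpoint of the range."""
    low, high = CANDIDATE_RANGES[use_case]
    return (low + high) // 2
```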
7. Use as a Tool in AI Applications
The Legal Data Hub can be integrated as a tool in AI agent applications, enabling autonomous systems to dynamically query legal information when needed. Modern AI frameworks typically convert function docstrings into tool descriptions automatically, guiding language models on when and how to use your tool. The docstring is thus the primary way to teach the model when to call the LDH.
Example Implementation
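A sketch of the tool function described below, with the kind of comprehensive docstring the frameworks rely on. The function name `legal_data_hub_anfrage` and the `/semantic-search`, `candidates`, and `post_reranking` parameters come from this guide; the base URL, authentication header, and response schema are placeholders you must replace with the real values from the API reference.

```python
import requests

LDH_BASE_URL = "https://api.example.com"  # placeholder: use your real base URL
API_KEY = "YOUR_API_KEY"                  # placeholder credential

def legal_data_hub_anfrage(frage: str) -> str:
    """Search German legal sources via the Otto Schmidt Legal Data Hub.

    Use this tool whenever a user question requires German legal
    expertise (statutes, case law, commentary).

    Args:
        frage: A fully formulated German question about a single legal
            topic, e.g. "Welche Voraussetzungen gelten für eine
            außerordentliche Kündigung nach § 626 BGB?". Avoid vague
            keyword queries; for comparisons, explicitly ask for the
            differences.

    Returns:
        A newline-separated list of relevant source passages.
    """
    response = requests.post(
        f"{LDH_BASE_URL}/semantic-search",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"query": frage, "candidates": 8, "post_reranking": True},
        timeout=30,
    )
    response.raise_for_status()
    # The response shape below is an assumption; adapt it to the real schema.
    hits = response.json().get("results", [])
    return "\n".join(hit.get("text", "") for hit in hits)
```

Note how the docstring restates the query-formulation rules from Section 2: agent frameworks surface it verbatim to the model, so the model learns to send well-formed German questions.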
How an agent uses this tool: When integrated into an AI agent, this function becomes available as a tool the agent can call autonomously. For example, if a user asks "Was sind die rechtlichen Anforderungen für eine Kündigung?", the agent recognizes this requires legal expertise, automatically calls legal_data_hub_anfrage() with a properly formulated German query following the guidelines in the docstring, and synthesizes the response into its answer.
Agent Frameworks for LDH Integration
The following agent frameworks support tool calling and can therefore be used to integrate the OS LDH:
LangChain Agents
Flexible agent framework with extensive tool ecosystem and LangGraph for complex workflows
Google Agent Development Kit (ADK)
Flexible, model-agnostic framework with multi-agent orchestration and deployment flexibility (local, Vertex AI, Cloud Run)
Microsoft Agent Framework
Production-ready framework for building robust, future-proof agentic AI solutions (Python & C#)
Important: All frameworks automatically use function docstrings as tool descriptions. Invest time in writing comprehensive documentation (as shown above)—it directly impacts how well the AI understands when and how to use your tool.
Pro tip: Reference Section 2 (Query Formulation) when writing tool documentation to ensure AI models generate queries that follow best practices.
Quick Reference Checklist
Before deploying your LDH integration, verify:
- You chose the right endpoint for your use case (semantic search vs. chat)
- Queries are fully formulated German sentences focused on a single topic
- The correct data assets are selected (wildcard "*" or a specific module)
- Filters use the correct metadata fields and ISO 8601 dates
- post_reranking is enabled only where quality matters more than speed
- candidates is tuned to your use case
- Tool docstrings follow the query formulation guidelines from Section 2