Best Practices

This guide provides essential best practices for developers integrating with the Otto Schmidt Legal Data Hub APIs. Following these guidelines ensures optimal search quality, performance, and user experience.



1. Choose the Right Endpoint for Your Use Case

The Legal Data Hub provides two main endpoints, each optimized for different scenarios:

Semantic Search (/semantic-search)

Best for: Document discovery, building comprehensive research, finding relevant precedents

Use when: You need to retrieve a larger number of relevant documents, build citation lists, explore a topic broadly, or want to build LLM responses yourself.

Chat (/chatbot)

Best for: Getting explanations, understanding procedures, obtaining direct answers to questions

Use when: You need a synthesized answer with reasoning, not just source documents.
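In code, the decision reduces to a single switch. A minimal sketch: the path "/chatbot" appears in this guide, while "/semantic-search" is an assumed path for the semantic-search endpoint discussed in Section 6.

```python
def choose_endpoint(need_synthesized_answer: bool) -> str:
    """Pick the LDH endpoint for a request.

    Chat ("/chatbot") returns a synthesized answer with reasoning;
    semantic search (assumed path "/semantic-search") returns the
    relevant source documents for you to process yourself.
    """
    return "/chatbot" if need_synthesized_answer else "/semantic-search"
```

For a RAG pipeline in which you build the LLM response yourself, `choose_endpoint(False)` selects document retrieval.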


2. Query Formulation: The Most Critical Success Factor

Query quality directly impacts result relevance. The Legal Data Hub runs multiple NLP (Natural Language Processing) models during search. These models are trained on vast amounts of internet text and work best with natural human language.

✅ Best Practices

  • Write fully-formulated German sentences.

  • Keep queries focused on a single legal topic.

  • Be specific, not vague.

  • For comparative queries, explicitly ask for differences.

Why This Matters

Language models trained on natural text understand context, relationships, and implied requirements. Well-formulated questions enable the system to retrieve precisely the information you need.
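To make the contrast concrete, here are two hypothetical query pairs (the German examples are the editor's illustrations, not taken from the API documentation), together with a cheap well-formedness heuristic:

```python
# Keyword fragments (keys) vs. fully formulated German questions (values).
# Both sides of each pair are illustrative, not from the official docs.
QUERY_EXAMPLES = {
    "Kündigung Frist": (
        "Welche Fristen gelten für die ordentliche Kündigung eines "
        "Arbeitsverhältnisses durch den Arbeitgeber?"
    ),
    "Schönheitsreparaturen Klausel": (
        "Unter welchen Voraussetzungen sind Schönheitsreparaturklauseln "
        "in Wohnraummietverträgen unwirksam?"
    ),
}

def looks_well_formed(query: str) -> bool:
    """Heuristic: a complete German question rather than bare keywords."""
    return query.strip().endswith("?") and len(query.split()) >= 6
```

A check like `looks_well_formed` is only a guardrail for your own integration tests; the real quality gain comes from writing the full question in the first place.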


3. Understand and Select the Right Data Assets

Data assets define the scope of your legal research and align with Otto Schmidt Online subscriptions. Before integrating, ensure you have acquired the correct data assets (modules) through your subscription to access the content you need.

Discover Available Assets

Pro Tip: You only need to query the available data assets once; the list changes only when you order new data assets.
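Since the asset list changes only with your subscription, fetching it once per process and caching the result is enough. A sketch using functools.lru_cache; `_fetch_assets_from_api` is a placeholder for the real discovery call (its endpoint is not shown here), and the module names are invented:

```python
from functools import lru_cache

def _fetch_assets_from_api() -> list[str]:
    # Placeholder for the actual HTTP call to the asset-discovery
    # endpoint; the returned module names are invented examples.
    return ["modul-arbeitsrecht", "modul-steuerrecht"]

@lru_cache(maxsize=1)
def available_data_assets() -> tuple[str, ...]:
    """Fetch the list of subscribed data assets once per process."""
    return tuple(_fetch_assets_from_api())
```

Every call after the first is served from the cache, so repeated lookups never hit the network.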

Selection Strategies

Wildcard "*" (Broad Research):

Best for:

  • Exploratory research, cross-domain legal issues or when the relevant domain is unclear.

  • If you dynamically update your data asset subscriptions, there is no need to update the code if you use the wildcard "*".

Specific Module (Focused Research):

Best for: Deep expertise in a specific legal area, or domain-focused applications, e.g. when the end user of your app can choose the module themselves.
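The two strategies differ only in the value passed for the data-asset scope. A hedged sketch of the request bodies (the key name "data_assets" and the module id are assumptions, not the documented schema):

```python
# Broad research: "*" follows your subscription automatically, so the
# code needs no update when new data assets are ordered.
broad_request = {
    "query": "Welche Pflichten hat der Vermieter bei Schimmelbefall?",
    "data_assets": ["*"],
}

# Focused research: pin one module (the id "modul-mietrecht" is a
# hypothetical example) so results stay within a single legal domain.
focused_request = {
    "query": "Welche Pflichten hat der Vermieter bei Schimmelbefall?",
    "data_assets": ["modul-mietrecht"],
}
```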


4. Narrow Results with Metadata Filters

Filters restrict results based on document metadata, enabling targeted legal research.

Common Filter Patterns

  • Filter by document type

  • Filter by court

  • Filter by legal reference (phrase matching)

  • Filter by date range (recent judgments)

Date format is ISO 8601 (YYYY-MM-DD). Operators: gte (>=), lte (<=), gt (>), lt (<).

Combined Filter (Court + Date + Document Type):

This finds BAG judgments from 2020 onwards on extraordinary termination grounds.
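A hedged sketch of that combined filter as a request fragment. The court and document-type field names come from the Key Metadata Fields table and the operators from the note above; the date field name ("metadata.datum") and the surrounding filter syntax are assumptions, not the documented schema:

```python
combined_filter = {
    "filters": [
        # Court: Bundesarbeitsgericht
        {"field": "metadata.gericht.keyword", "value": "BAG"},
        # Document type: judgments only
        {"field": "metadata.dokumententyp.keyword", "value": "Urteil"},
        # Judgments from 2020 onwards (ISO 8601 date; gte means >=).
        # "metadata.datum" is an assumed field name.
        {"field": "metadata.datum", "operator": "gte", "value": "2020-01-01"},
    ]
}
```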

Key Metadata Fields

| Field | Purpose | Example |
|---|---|---|
| metadata.dokumententyp.keyword | Document type | "Kommentar", "Urteil", "Gesetz" |
| metadata.gericht.keyword | Court/organization | "BGH", "BAG", "BFH" |
| metadata.normenkette | Legal reference | "BGB 535", "KSchG 1" |


5. Optimize Results with Post-Reranking

Post-reranking uses advanced semantic models to reorder search results for better relevance.

When to Enable post_reranking = true

Use when:

  • Complex queries with nuanced intent

  • Quality is more important than speed

  • Initial results need better relevance ranking

Avoid when:

  • Response time is critical

  • Your search query is very generic

Remember: Reranking effectiveness depends heavily on well-formulated queries (see Section 2).


6. Control Result Quantity with candidates

The candidates parameter for the semantic-search endpoint determines how many results are returned.

Recommendations

These recommendations are based on internal benchmarks.

| Use Case | Suggested candidates |
|---|---|
| Chatbot | 5-8 |
| Comprehensive research | 10-20 |

Balance: More candidates = broader coverage but slower responses and potentially more noise.
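Both tuning knobs can be set per request. A sketch of a semantic-search request body: the parameter names `post_reranking` and `candidates` come from Sections 5 and 6, while "query" and "data_assets" are assumed key names rather than the documented schema.

```python
def build_search_request(query: str, use_case: str = "chatbot") -> dict:
    """Assemble a semantic-search request body.

    The candidates values follow the recommendation table above; the
    exact body schema is an assumption, not the documented API.
    """
    candidates_by_use_case = {"chatbot": 8, "research": 20}
    return {
        "query": query,
        "data_assets": ["*"],
        "post_reranking": True,  # favor relevance over latency
        "candidates": candidates_by_use_case[use_case],
    }
```

Lowering `candidates` and disabling `post_reranking` is the corresponding latency-first configuration.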


7. Use as a Tool in AI Applications

The Legal Data Hub can be integrated as a tool in AI agent applications, enabling autonomous systems to query legal information dynamically when needed. Modern AI frameworks typically convert function docstrings into tool descriptions that guide language models on when and how to use your tool. Comprehensive inline documentation is therefore critical for successful integration.

Example Implementation
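A minimal sketch of such a tool function, assuming a chatbot-style call over HTTP; the endpoint URL, body schema, auth header, and response shape are all illustrative assumptions. The extensive docstring is the part the agent framework actually consumes:

```python
import json
import os
import urllib.request

def legal_data_hub_anfrage(frage: str) -> str:
    """Search the Otto Schmidt Legal Data Hub for German legal information.

    Use this tool for questions about German law (e.g. employment,
    tenancy, or tax law).

    Args:
        frage: A fully formulated German question about exactly one
            legal topic, e.g. "Welche Fristen gelten für die ordentliche
            Kündigung eines Arbeitsverhältnisses?". No keyword lists.

    Returns:
        A synthesized answer with source citations.
    """
    # Hypothetical endpoint, body schema, and auth -- adapt to your deployment.
    request = urllib.request.Request(
        "https://api.example.com/chatbot",
        data=json.dumps({"query": frage}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + os.environ.get("LDH_API_KEY", ""),
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())["answer"]
```

Note how the docstring encodes both when to call the tool (German legal questions) and how to phrase the `frage` argument, mirroring the query guidelines of Section 2.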

How an agent uses this tool: When integrated into an AI agent, this function becomes available as a tool the agent can call autonomously. For example, if a user asks "Was sind die rechtlichen Anforderungen für eine Kündigung?", the agent recognizes this requires legal expertise, automatically calls legal_data_hub_anfrage() with a properly formulated German query following the guidelines in the docstring, and synthesizes the response into its answer.

Agent Frameworks for LDH Integration

The following agent frameworks support tool calling and can therefore integrate the OS LDH as a tool:

| Framework | Key Capability |
|---|---|
| LangChain Agents | Flexible agent framework with extensive tool ecosystem and LangGraph for complex workflows |
| OpenAI Agents SDK | Lightweight, production-ready agent framework with minimal abstractions |
| LlamaIndex Agents | Specialized for document-centric agentic workflows and RAG applications |
| CrewAI | Multi-agent collaboration framework for orchestrating role-playing autonomous agents |
| Google Agent Development Kit (ADK) | Flexible, model-agnostic framework with multi-agent orchestration and deployment flexibility (local, Vertex AI, Cloud Run) |
| Microsoft Agent Framework | Production-ready framework for building robust, future-proof agentic AI solutions (Python & C#) |

Important: All frameworks automatically use function docstrings as tool descriptions. Invest time in writing comprehensive documentation (as shown above); it directly impacts how well the AI understands when and how to use your tool.

Pro tip: Reference Section 2 (Query Formulation) when writing tool documentation to ensure AI models generate queries that follow best practices.


Quick Reference Checklist

Before deploying your LDH integration, verify:

  • You chose the right endpoint for your use case (Section 1)

  • Queries are fully formulated German sentences focused on a single topic (Section 2)

  • Your subscription covers the data assets you request (Section 3)

  • Filters use the documented metadata fields and ISO 8601 dates (Section 4)

  • post_reranking is enabled only where quality outweighs speed (Section 5)

  • candidates matches your use case (Section 6)

  • Tool docstrings document when and how the LDH should be called (Section 7)
