Best Practices
This guide provides essential best practices for developers integrating with the Otto Schmidt Legal Data Hub APIs. Following these guidelines ensures optimal search quality, performance, and user experience.
Reading time: 3-5 minutes
1. Choose the Right Endpoint for Your Use Case
The Legal Data Hub provides two main endpoints, each optimized for different scenarios:
Semantic Search (/semantic-search)
Best for: Document discovery, building comprehensive research, finding relevant precedents
Use when: You need to retrieve a larger number of relevant documents, build citation lists, explore a topic broadly, or want to build LLM responses yourself.
Chat (/chatbot)
Best for: Getting explanations, understanding procedures, getting questions answered
Use when: You need a synthesized answer with reasoning, not just source documents.
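The endpoint choice above can be sketched as a small helper. The endpoint paths and the `candidates` parameter come from this guide; the payload field name `query` and the concrete default values are illustrative assumptions, not the official schema.

```python
def build_request(query: str, need_synthesized_answer: bool) -> tuple[str, dict]:
    """Return (endpoint, payload) for the LDH call that fits the use case.

    Payload field names are assumptions for illustration; check the API
    reference for the real request schema.
    """
    if need_synthesized_answer:
        # Chat: a synthesized answer with reasoning, not just sources.
        return "/chatbot", {"query": query}
    # Semantic search: raw source documents for your own processing.
    return "/semantic-search", {"query": query, "candidates": 10}
```

For example, `build_request("Welche Fristen gelten bei der ordentlichen Kündigung?", need_synthesized_answer=False)` targets `/semantic-search` because the caller wants documents rather than an answer.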
2. Query Formulation: The Most Critical Success Factor
Query quality directly impacts result relevance. The Legal Data Hub uses multiple NLP (Natural Language Processing) models during search. These models are trained on vast amounts of internet data and therefore tend to work better with natural human language.
✅ Best Practices
Write fully-formulated German sentences:
Keep queries focused on a single legal topic:
Be specific, not vague:
For comparative queries, explicitly ask for differences:
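The guidelines above can be illustrated with before/after query pairs. The German queries below are illustrative examples written for this sketch, not taken from the official documentation; each key is a weak query and each value a reformulation that follows one of the guidelines.

```python
# Illustrative query pairs: key = weak query, value = reformulated query.
QUERY_EXAMPLES = {
    # Vague keyword -> fully formulated German sentence, specific not vague
    "Kündigung": (
        "Welche Voraussetzungen gelten für eine außerordentliche "
        "Kündigung nach § 626 BGB?"
    ),
    # Two legal areas mixed -> single focused topic
    "Mietrecht und Arbeitsrecht Verjährung": (
        "Wann verjähren Ersatzansprüche des Vermieters nach § 548 BGB?"
    ),
    # Implicit comparison -> explicitly asking for differences
    "ordentliche außerordentliche Kündigung": (
        "Was sind die Unterschiede zwischen ordentlicher und "
        "außerordentlicher Kündigung?"
    ),
}
```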
Why This Matters
Language models trained on natural text understand context, relationships, and implied requirements. Well-formulated questions enable the system to retrieve precisely the information you need.
3. Understand and Select the Right Data Assets
Data assets define the scope of your legal research and align with Otto Schmidt Online subscriptions. Before integrating, ensure you have acquired the correct data assets (modules) through your subscription to access the content you need.
Discover Available Assets
Pro Tip: You only need to query the available data assets once. The list changes only when you order new data assets.
Selection Strategies
Wildcard "*" (Broad Research):
Best for:
Exploratory research, cross-domain legal issues, or cases where the relevant domain is unclear. If you dynamically update your data asset subscriptions, the wildcard "*" also means you never need to change your code.
Specific Module (Focused Research):
Best for: Deep expertise in a specific legal area, or domain-focused applications, e.g. when the end user of your app can choose the module themselves.
4. Use Filters for Precision Search
Filters narrow results based on document metadata, enabling targeted legal research.
Common Filter Patterns
Filter by Document Type:
Filter by Court:
Filter by Legal Reference (Phrase Matching):
Filter by Date Range (Recent Judgments):
Date format is ISO 8601 (YYYY-MM-DD). Operators: gte (>=), lte (<=), gt (>), lt (<).
Combined Filter (Court + Date + Document Type):
This finds BAG judgments from 2020 onwards on extraordinary termination grounds.
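A sketch of that combined filter as a request payload. The metadata field names, the `gte` operator, and the ISO 8601 date format come from this guide; the exact nesting of the filter object and the `metadata.datum` date field name are assumptions to be checked against the API reference.

```python
# Combined filter: BAG judgments from 2020 onwards, as described above.
# Field names and the gte operator are from the guide; the JSON shape
# and the date field name "metadata.datum" are illustrative assumptions.
combined_filter = {
    "metadata.gericht.keyword": "BAG",
    "metadata.dokumententyp.keyword": "Urteil",
    "metadata.datum": {"gte": "2020-01-01"},  # ISO 8601, >= operator
}

payload = {
    "query": "Welche Gründe rechtfertigen eine außerordentliche Kündigung?",
    "filters": combined_filter,
}
```

A legal-reference filter would use `metadata.normenkette` with a phrase value such as `"KSchG 1"` in the same way.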
Key Metadata Fields
| Field | Description | Example Values |
| --- | --- | --- |
| metadata.dokumententyp.keyword | Document type | "Kommentar", "Urteil", "Gesetz" |
| metadata.gericht.keyword | Court/organization | "BGH", "BAG", "BFH" |
| metadata.normenkette | Legal reference | "BGB 535", "KSchG 1" |
5. Optimize Results with Post-Reranking
Post-reranking uses advanced semantic models to reorder search results for better relevance.
When to Enable post_reranking = true
✅ Use when:
Complex queries with nuanced intent
Quality is more important than speed
Initial results need better relevance ranking
❌ Avoid when:
Response time is critical
Your search query is very generic
Remember: Reranking effectiveness depends heavily on well-formulated queries (see Section 2).
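The ✅/❌ criteria above can be encoded in a small payload builder. The `post_reranking` parameter name is from this guide; the word-count heuristic for "very generic" queries and the payload shape are illustrative assumptions.

```python
def build_search_payload(query: str, latency_critical: bool = False) -> dict:
    """Build a semantic-search payload, enabling post-reranking only when
    the query looks specific and response time is not the priority.

    The word-count heuristic is a crude stand-in for 'generic query';
    tune it for your application.
    """
    # Very short queries are usually generic; reranking adds little there.
    query_is_specific = len(query.split()) >= 5
    return {
        "query": query,
        "post_reranking": query_is_specific and not latency_critical,
    }
```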
6. Control Result Quantity with candidates
The candidates parameter for the semantic-search endpoint determines how many results are returned.
Recommendations
The following recommendations are made based on internal benchmarks.
| Use Case | Suggested candidates |
| --- | --- |
| Chatbot | 5-8 |
| Comprehensive research | 10-20 |
Balance: More candidates = broader coverage but slower responses and potentially more noise.
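The recommendations can be turned into a lookup with a sensible default. The ranges are the benchmark-based values from the table in this section; picking the midpoint as a default is an assumption of this sketch.

```python
# Suggested `candidates` ranges per use case, from the table above.
CANDIDATE_RANGES = {
    "chatbot": (5, 8),
    "comprehensive_research": (10, 20),
}

def suggest_candidates(use_case: str) -> int:
    """Return a default `candidates` value: the midpoint of the range."""
    low, high = CANDIDATE_RANGES[use_case]
    return (low + high) // 2
```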
7. Use as a Tool in AI Applications
The Legal Data Hub can be integrated as a tool in AI agent applications, enabling autonomous systems to dynamically query legal information when needed. Modern AI frameworks typically convert function docstrings into tool descriptions automatically, guiding language models on when and how to use your tool. The docstring is thus the primary way to teach the model when to call the LDH.
Example Implementation
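A sketch of the tool function described below, with the kind of comprehensive docstring the frameworks rely on. The function name `legal_data_hub_anfrage` and the `/semantic-search`, `candidates`, and `post_reranking` parameters come from this guide; the base URL, authentication header, and response schema are placeholders you must replace with the real values from the API reference.

```python
import requests

LDH_BASE_URL = "https://api.example.com"  # placeholder: use your real base URL
API_KEY = "YOUR_API_KEY"                  # placeholder credential

def legal_data_hub_anfrage(frage: str) -> str:
    """Search German legal sources via the Otto Schmidt Legal Data Hub.

    Use this tool whenever a user question requires German legal
    expertise (statutes, case law, commentary).

    Args:
        frage: A fully formulated German question about a single legal
            topic, e.g. "Welche Voraussetzungen gelten für eine
            außerordentliche Kündigung nach § 626 BGB?". Avoid vague
            keyword queries; for comparisons, explicitly ask for the
            differences.

    Returns:
        A newline-separated list of relevant source passages.
    """
    response = requests.post(
        f"{LDH_BASE_URL}/semantic-search",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"query": frage, "candidates": 8, "post_reranking": True},
        timeout=30,
    )
    response.raise_for_status()
    # The response shape below is an assumption; adapt it to the real schema.
    hits = response.json().get("results", [])
    return "\n".join(hit.get("text", "") for hit in hits)
```

Note how the docstring restates the query-formulation rules from Section 2: agent frameworks surface it verbatim to the model, so the model learns to send well-formed German questions.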
How an agent uses this tool: When integrated into an AI agent, this function becomes available as a tool the agent can call autonomously. For example, if a user asks "Was sind die rechtlichen Anforderungen für eine Kündigung?", the agent recognizes this requires legal expertise, automatically calls legal_data_hub_anfrage() with a properly formulated German query following the guidelines in the docstring, and synthesizes the response into its answer.
Agent Frameworks for LDH Integration
The following agent frameworks support tool calling and can therefore be used to integrate the OS LDH:
LangChain Agents
Flexible agent framework with extensive tool ecosystem and LangGraph for complex workflows
Google Agent Development Kit (ADK)
Flexible, model-agnostic framework with multi-agent orchestration and deployment flexibility (local, Vertex AI, Cloud Run)
Microsoft Agent Framework
Production-ready framework for building robust, future-proof agentic AI solutions (Python & C#)
Important: All frameworks automatically use function docstrings as tool descriptions. Invest time in writing comprehensive documentation (as shown above)—it directly impacts how well the AI understands when and how to use your tool.
Pro tip: Reference Section 2 (Query Formulation) when writing tool documentation to ensure AI models generate queries that follow best practices.
Quick Reference Checklist
Before deploying your LDH integration, verify:
- You chose the right endpoint for your use case (semantic search vs. chat)
- Queries are fully formulated German sentences focused on a single topic
- The correct data assets are selected (wildcard "*" or a specific module)
- Filters use the correct metadata fields and ISO 8601 dates
- post_reranking is enabled only where quality matters more than speed
- candidates is tuned to your use case
- Tool docstrings follow the query formulation guidelines from Section 2