Wald Advanced Contextual DLP

Detect sensitive text with advanced contextual understanding

API Examples & Instructions

🚀 Quick Start

This API exposes two endpoints: one for sensitive data detection and classification, and one for measuring network latency:

  • /predict - Text analysis (sensitive detection + fine-grained classes)
  • /predict/dummy - No inference; use to measure roundtrip network latency only

Key Features: Detects sensitive content and provides detailed classification into categories like Personal Information, Financial Data, Source Code, etc.

1. Text Analysis (/predict)

curl -X POST http://localhost:59995/predict \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{"text": "This is a test sentence about a financial report with personal data."}'

🐍 Python Example

import requests

# Single text analysis
url = "http://your-api-url/predict"
headers = {"X-API-Key": "YOUR_API_KEY"}  # same header as the curl example
data = {
    "text": "This is a test sentence about a financial report with personal data."
}

response = requests.post(url, json=data, headers=headers, timeout=10)
response.raise_for_status()  # raise on HTTP 4xx/5xx responses
result = response.json()

print(f"Is Sensitive: {result['prediction']['is_sensitive']}")
print(f"Classes: {result['prediction']['classes']}")
print(f"Model Time: {result['timing']['model_time_ms']:.2f}ms")
print(f"Total Server Time: {result['timing']['total_server_time_ms']:.2f}ms")

✅ Response Format

{
  "prediction": {
    "is_sensitive": true,
    "classes": ["Personal Information", "Financial Data"]
  },
  "timing": {
    "model_time_ms": 15.2,
    "total_server_time_ms": 18.5
  }
}
  • model_time_ms – Time spent in model inference only (tokenization + forward pass), in milliseconds.
  • total_server_time_ms – Total time the server spent handling the request (from receipt to response), in milliseconds. Includes model_time_ms plus parsing, logging, and other overhead.
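
Because total_server_time_ms includes model_time_ms, the difference between the two approximates the non-inference overhead of a /predict request. The following is a minimal sketch assuming the response shape shown above; server_overhead_ms is a hypothetical helper name, not part of the API.

```python
def server_overhead_ms(result: dict) -> float:
    """Return server time not spent in inference (parsing, logging, etc.), in ms."""
    timing = result["timing"]
    return timing["total_server_time_ms"] - timing["model_time_ms"]


# Sample response copied from the response format above.
sample = {
    "prediction": {
        "is_sensitive": True,
        "classes": ["Personal Information", "Financial Data"],
    },
    "timing": {"model_time_ms": 15.2, "total_server_time_ms": 18.5},
}

overhead = server_overhead_ms(sample)
print(f"Server overhead: {overhead:.2f}ms")
```

This subtraction is only meaningful for /predict responses, where both timing fields are measured values.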

2. Roundtrip network latency (/predict/dummy)

/predict/dummy accepts the same request body as /predict but performs no model inference. The server returns a fixed dummy prediction; timing.model_time_ms is a fixed dummy value, while timing.total_server_time_ms is the actual (minimal) server handling time. Use this endpoint to measure roundtrip network latency without inference overhead, since the client-side elapsed time is dominated by network RTT. There are two ways to use the numbers:

  • Subtract timing.total_server_time_ms from the client elapsed time to approximate network latency.
  • Compare client latency for /predict vs /predict/dummy to isolate inference cost.

curl -X POST http://localhost:59995/predict/dummy \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{"text": "any text"}'

Measure time from request send to response receive on the client; that is your roundtrip latency. The response includes timing.model_time_ms (fixed dummy value) and timing.total_server_time_ms (actual server handling time).

✅ Response (dummy data; note the prediction is returned under dummy_prediction rather than prediction)

{
  "text": "any text",
  "dummy_prediction": {
    "is_sensitive": true,
    "is_code": false,
    "classes": ["Address Data", "Personal Attribute Data"],
    "confidence": 0.985
  },
  "timing": {
    "model_time_ms": 15.5,
    "total_server_time_ms": 2.1
  }
}
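
The client-side measurement described in this section can be sketched as follows. This is an illustrative sketch, not the service's own tooling: measure_network_latency_ms and fake_post are hypothetical names, and fake_post stands in for a real request so the example runs offline. It takes the median over several samples to reduce the effect of outliers.

```python
import time
from statistics import median
from typing import Callable


def measure_network_latency_ms(post: Callable[[], dict], samples: int = 5) -> float:
    """Estimate network latency as client elapsed time minus server handling time.

    `post` sends one request to /predict/dummy and returns the parsed JSON body.
    """
    estimates = []
    for _ in range(samples):
        start = time.perf_counter()
        result = post()
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        server_ms = result["timing"]["total_server_time_ms"]
        # Clamp at zero in case timer jitter makes the difference negative.
        estimates.append(max(elapsed_ms - server_ms, 0.0))
    return median(estimates)


def fake_post() -> dict:
    """Stand-in for a real request (assumption: simulates a ~20 ms roundtrip)."""
    time.sleep(0.02)
    return {"timing": {"model_time_ms": 15.5, "total_server_time_ms": 2.1}}


latency = measure_network_latency_ms(fake_post, samples=3)
print(f"Approx. network latency: {latency:.1f}ms")
```

To run this against the real service, replace fake_post with a function that calls requests.post on /predict/dummy with the X-API-Key header and returns response.json(). The elapsed time then also includes client-side JSON parsing, so treat the result as an approximation.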
