🚀 Quick Start
This API provides a main endpoint for sensitive data detection and classification:
- /predict - Text analysis (sensitive detection + fine-grained classes)
- /predict/dummy - No inference; use to measure roundtrip network latency only
Key Features: Detects sensitive content and provides detailed classification into categories like Personal Information, Financial Data, Source Code, etc.
1. Text Analysis (/predict)
curl -X POST http://localhost:59995/predict \
-H "Content-Type: application/json" \
-H "X-API-Key: YOUR_API_KEY" \
-d '{"text": "This is a test sentence about a financial report with personal data."}'
🐍 Python Example
import requests
# Single text analysis
url = "http://your-api-url/predict"
data = {
"text": "This is a test sentence about a financial report with personal data."
}
response = requests.post(url, json=data)
result = response.json()
print(f"Is Sensitive: {result['prediction']['is_sensitive']}")
print(f"Classes: {result['prediction']['classes']}")
print(f"Model Time: {result['timing']['model_time_ms']:.2f}ms")
print(f"Total Server Time: {result['timing']['total_server_time_ms']:.2f}ms")
✅ Response Format
{
"prediction": {
"is_sensitive": true,
"classes": ["Personal Information", "Financial Data"]
},
"timing": {
"model_time_ms": 15.2,
"total_server_time_ms": 18.5
}
}
- model_time_ms – Time spent in model inference only (tokenization + forward pass), in milliseconds.
- total_server_time_ms – Total time the server spent handling the request (from receipt to response), in milliseconds. Includes model_time_ms plus parsing, logging, and other overhead.
2. Roundtrip network latency (/predict/dummy)
/predict/dummy accepts the same request body as /predict but performs no model inference. The server returns a fixed dummy prediction and timing. Use it to measure roundtrip network latency without inference overhead: the client-side elapsed time is dominated by network RTT, and timing.total_server_time_ms reflects server handling time (minimal). Subtract server time from client elapsed time to approximate network latency, or compare client latency for /predict vs /predict/dummy to isolate inference cost.
curl -X POST http://localhost:59995/predict/dummy \
-H "Content-Type: application/json" \
-H "X-API-Key: YOUR_API_KEY" \
-d '{"text": "any text"}'
Measure time from request send to response receive on the client; that is your roundtrip latency. The response includes timing.model_time_ms (fixed dummy value) and timing.total_server_time_ms (actual server handling time).
✅ Response (same shape as /predict, dummy data)
{
"text": "any text",
"dummy_prediction": {
"is_sensitive": true,
"is_code": false,
"classes": ["Address Data", "Personal Attribute Data"],
"confidence": 0.985
},
"timing": {
"model_time_ms": 15.5,
"total_server_time_ms": 2.1
}
}