Parses container IDs, notify parties, and HS codes from Maersk, MSC, and CMA CGM bills of lading, then pushes them directly to CargoWise or Flexport.
// Extracted BOL Data
{
  "bl_number": "MSCU9872341-001",
  "container_id": "MSCU9872341", // ✓ 99.8% confidence
  "container_valid": true,
  "type": "40HC",
  "weight_kg": 28459,
  "shipper": "ACME Manufacturing Co.",
  "consignee": "Global Imports Inc.",
  "port_load": "CNSHA",
  "port_discharge": "USLGB"
}
bl_number,container_id,type,weight_kg,shipper,consignee
MSCU9872341-001,MSCU9872341,40HC,28459,"ACME Manufacturing Co.","Global Imports Inc."
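The CSV export is a flattened projection of the JSON record. A minimal sketch of that projection using Python's standard `csv` module, assuming the column order of the CSV header shown above; the `to_csv` helper is a hypothetical name and not part of any SDK.

```python
import csv
import io

# Column order taken from the CSV header shown above.
CSV_COLUMNS = ["bl_number", "container_id", "type", "weight_kg", "shipper", "consignee"]


def to_csv(records):
    """Render extracted BOL records (dicts) as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=CSV_COLUMNS,
        extrasaction="ignore",  # drop JSON keys that have no CSV column
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Sample record copied from the JSON example above.
record = {
    "bl_number": "MSCU9872341-001",
    "container_id": "MSCU9872341",
    "container_valid": True,
    "type": "40HC",
    "weight_kg": 28459,
    "shipper": "ACME Manufacturing Co.",
    "consignee": "Global Imports Inc.",
    "port_load": "CNSHA",
    "port_discharge": "USLGB",
}

csv_text = to_csv([record])
print(csv_text)
```

The `csv` module quotes a field only when it contains a delimiter, quote, or newline, so the company names come out unquoted here; both forms parse to the same values.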
POST /api/v1/shipments
Authorization: Bearer ••••••••
{
"system": "cargowise",
"payload": { ... }
}
→ 201 Created
Autodetects running lists of goods, weights, and seal numbers.
We validate the check digit of every container ID against ISO 6346 to catch misread or mistyped characters.
Webhooks push data directly to your logistics platform, such as CargoWise or Flexport.
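The container ID checksum can be sketched as follows. This assumes the standard in question is ISO 6346, which defines the check digit for freight container identifiers; the function names are illustrative, and this is not necessarily how PDF2TEXT implements the check.

```python
import re


def _letter_values():
    """ISO 6346 letter values: A=10, then ascending, skipping multiples of 11."""
    values, n = {}, 10
    for ch in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
        if n % 11 == 0:
            n += 1
        values[ch] = n
        n += 1
    return values


LETTER_VALUES = _letter_values()


def check_digit(prefix10):
    """Compute the check digit for the first 10 characters (4 letters + 6 digits)."""
    total = 0
    for position, ch in enumerate(prefix10):
        value = LETTER_VALUES[ch] if ch in LETTER_VALUES else int(ch)
        total += value * (2 ** position)  # weights are 1, 2, 4, ..., 512
    return total % 11 % 10  # a remainder of 10 maps to check digit 0


def is_valid_container_id(container_id):
    """Return True if the ID has the 4-letter + 7-digit shape and a matching check digit."""
    cid = container_id.strip().upper()
    # Shape check only; this sketch does not restrict the fourth letter to U, J, or Z.
    if not re.fullmatch(r"[A-Z]{4}[0-9]{7}", cid):
        return False
    return check_digit(cid[:10]) == int(cid[10])


print(is_valid_container_id("CSQU3054383"))  # True
print(is_valid_container_id("CSQU3054384"))  # False: last digit altered
```

A single misread character changes the weighted sum, so in most cases the check digit no longer matches and the ID can be flagged before it reaches a downstream system.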
PROCESSING_PIPELINE
Accepts raw files via API, Email, or SFTP upload.
Maps unstructured pixels to structured JSON schemas.
Delivers clean data directly to your logistics platform.
SCHEMA_CATALOG
Comparative performance metrics across document processing approaches.
| Metric | Manual Entry | Legacy OCR / Zonal | PDF2TEXT Neural |
|---|---|---|---|
| Handwritten Notes | 0% | Fail | 96% |
| Layout Variance | High Cost | Breaks | Layout Agnostic |
| Setup Time | 0h | 40h per template | Zero-shot |
| Processing Speed | 5 min/doc | 2 min/doc | <0.8s/doc |
| Multi-language Support | Staff dependent | Limited | 40+ languages |
| Error Rate | 2-5% | 8-15% | <0.5% |
Benchmarks measured on standardized BOL dataset (n=10,000). Legacy OCR tested with ABBYY FlexiCapture. Neural model: pdf2text-v3-logistics.
Webhooks, retries, and rate limiting handled out of the box. SDKs available for Python, Node, and Go.
# 1. Install the SDK
npm install @pdf2text/sdk
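// 2. Extract a document
const pdf2text = require('@pdf2text/sdk');
// await is not allowed at the top level of a CommonJS module, so wrap the call
(async () => {
  const data = await pdf2text.extract({
    file: './bol_maersk.pdf',
    mode: 'ocr_dense'
  });
  console.log(data.container_id); // "MSCU9872341"
})();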
# Extract document via REST API
curl -X POST https://pdf2text.ai/api/v1/documents/upload/ \
  -H "Authorization: Bearer $API_KEY" \
  -F "file=@./bol_maersk.pdf" \
  -F "mode=ocr_dense"
# Response
{
"container_id": "MSCU9872341",
"confidence": 0.998
}
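The `confidence` value in the response can be used to decide whether a field is accepted automatically or routed to a person for review. A minimal sketch, assuming the response shape shown above; the 0.95 threshold and the `needs_review` helper are illustrative assumptions, not documented defaults.

```python
import json

REVIEW_THRESHOLD = 0.95  # assumed cut-off; tune against your own error tolerance


def needs_review(response_body, threshold=REVIEW_THRESHOLD):
    """Return True when the extraction should be checked by a person."""
    result = json.loads(response_body)
    # Treat a missing confidence score as low confidence.
    return result.get("confidence", 0.0) < threshold


body = '{"container_id": "MSCU9872341", "confidence": 0.998}'
print(needs_review(body))  # False
```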
# 1. Install the SDK
pip install pdf2text-sdk
# 2. Extract a document
from pdf2text import Client
client = Client(api_key="your_api_key")
data = client.extract(
    file="./bol_maersk.pdf",
    mode="ocr_dense"
)
print(data.container_id) # "MSCU9872341"
Estimate cost reduction based on your document processing volume.
Based on average extraction latency of 1.2s vs human average of 300s. Estimate uses Pro plan pricing (~$0.15/doc).
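The arithmetic behind such an estimate can be sketched as below. The per-document times and price are the figures quoted above; the hourly labor cost and the monthly volume are assumptions supplied by the caller, and the calculator on this page may weigh other factors.

```python
# Figures quoted in the text above.
HUMAN_SECONDS_PER_DOC = 300
MACHINE_SECONDS_PER_DOC = 1.2
PRICE_PER_DOC_USD = 0.15  # approximate Pro plan price


def estimate_monthly_savings(docs_per_month, hourly_labor_cost_usd):
    """Compare manual keying cost with per-document pricing for one month."""
    manual_hours = docs_per_month * HUMAN_SECONDS_PER_DOC / 3600
    machine_hours = docs_per_month * MACHINE_SECONDS_PER_DOC / 3600
    manual_cost = manual_hours * hourly_labor_cost_usd
    automated_cost = docs_per_month * PRICE_PER_DOC_USD
    return {
        "hours_saved": manual_hours - machine_hours,
        "manual_cost_usd": manual_cost,
        "automated_cost_usd": automated_cost,
        "net_savings_usd": manual_cost - automated_cost,
    }


# Hypothetical inputs: 1,000 documents per month at $25/hour labor cost.
estimate = estimate_monthly_savings(docs_per_month=1000, hourly_labor_cost_usd=25.0)
for key, value in estimate.items():
    print(f"{key}: {value:,.2f}")
```

With these assumed inputs the sketch gives roughly 83 hours and about $1,933 saved per month.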
Upload a messy PDF, image, or scan. No API key required.
| Field | Value |
|---|---|
| Container ID | MSCU9872341 ✓ |
| Type | 40HC |
| Weight | 28,459 KG |
| Shipper | ACME Manufacturing Co. |
Technical details for security review and integration planning.
No. Zero-retention policy. Documents are processed in RAM and flushed immediately after webhook delivery. AES-256 encryption in transit.
View Security Policy →
Yes. Docker container deployment for air-gapped environments. Kubernetes helm charts available. Minimum 8GB RAM, GPU optional.
Contact Sales →
REST API with webhooks. SDKs for Python, Node.js, Go. Pre-built connectors for SAP, Oracle, NetSuite. ERP sync via Zapier or direct API.
View API Docs →
Yes. Neural model trained on 2M+ handwritten samples. 96% accuracy on cursive English. Supports annotations, stamps, and margin notes.
View Accuracy Report →
Yes. Fine-tuning available for proprietary formats. Provide 50+ samples and we train a specialized extraction model within 48 hours.
Request Custom Model →
99.9% SLA on Enterprise plans. Multi-region failover (US-East, EU-West, APAC). Real-time status page with incident history.
View Status Page →
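As a rough guide to what a 99.9% availability target permits, the allowed downtime can be computed directly. A short sketch, assuming a 30-day month and a 365-day year; how the SLA is actually measured and what it excludes are not stated here.

```python
def allowed_downtime_minutes(availability_pct, period_days):
    """Maximum downtime, in minutes, that still meets the availability target."""
    unavailable_fraction = (100.0 - availability_pct) / 100.0
    return unavailable_fraction * period_days * 24 * 60


monthly = allowed_downtime_minutes(99.9, 30)
yearly = allowed_downtime_minutes(99.9, 365)
print(f"per 30-day month: {monthly:.1f} min")   # 43.2 min
print(f"per 365-day year: {yearly / 60:.2f} h")  # 8.76 h
```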