Extract Bills of Lading into JSON/CSV instantly.

Parses container IDs, notify parties, and HS codes from Maersk, MSC, and CMA CGM. Push directly to CargoWise or Flexport.

View API Documentation →
EXTRACTION_MONITOR v2.1.0
READY
Source_File.pdf
BILL OF LADING
MSC MEDITERRANEAN SHIPPING CO.
B/L Number
MSCU9872341-001
Shipper
ACME MANUFACTURING CO LTD
Container
MSCU 987234-1
Type 40HC Weight 28,459 KG
Consignee
GLOBAL IMPORTS INC
BOL_MSCU9872341.pdf
// Extracted BOL Data
{
  "bl_number": "MSCU9872341-001",
  "container_id": "MSCU9872341", // ✓ 99.8% confidence
  "container_valid": true,
  "type": "40HC",
  "weight_kg": 28459,
  "shipper": "ACME Manufacturing Co.",
  "consignee": "Global Imports Inc.",
  "port_load": "CNSHA",
  "port_discharge": "USLGB"
}
bl_number,container_id,type,weight_kg,shipper,consignee
MSCU9872341-001,MSCU9872341,40HC,28459,"ACME Manufacturing Co.","Global Imports Inc."
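
For teams that receive the JSON output and want the CSV form themselves, the conversion can be sketched with Python's standard library. This is a minimal illustration, not the product's exporter: the function name `bol_json_to_csv` is hypothetical, and the column order is simply copied from the CSV header shown above.

```python
import csv
import io
import json

# Column order copied from the CSV header on this page.
COLUMNS = ["bl_number", "container_id", "type", "weight_kg", "shipper", "consignee"]


def bol_json_to_csv(json_text: str) -> str:
    """Convert one extracted BOL JSON object into a header row plus one data row."""
    record = json.loads(json_text)
    buffer = io.StringIO()
    # extrasaction="ignore" drops fields (such as port codes) that are not CSV columns.
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerow(record)
    return buffer.getvalue()


sample = json.dumps({
    "bl_number": "MSCU9872341-001",
    "container_id": "MSCU9872341",
    "container_valid": True,
    "type": "40HC",
    "weight_kg": 28459,
    "shipper": "ACME Manufacturing Co.",
    "consignee": "Global Imports Inc.",
    "port_load": "CNSHA",
    "port_discharge": "USLGB",
})
print(bol_json_to_csv(sample))
```

The `csv` module only quotes a value when it contains a delimiter, quote, or newline, so company names without commas come out unquoted.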
POST /api/v1/shipments
Authorization: Bearer ••••••••

{
  "system": "cargowise",
  "payload": { ... }
}

→ 201 Created
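
The request above could be assembled from Python roughly as follows. This is a sketch under assumptions: the base URL is a placeholder, the `Content-Type` header and the idea that `payload` carries the extracted record are guesses, and `build_shipment_request` is a hypothetical helper. The code only builds the request object; it does not send it.

```python
import json
import urllib.request


def build_shipment_request(base_url: str, api_key: str, system: str, payload: dict):
    """Build (but do not send) a POST /api/v1/shipments request."""
    body = json.dumps({"system": system, "payload": payload}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/shipments",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed; not stated on this page
        },
    )


# Placeholder host and key, for illustration only.
request = build_shipment_request(
    "https://example.invalid", "test-key", "cargowise", {"bl_number": "MSCU9872341-001"}
)
print(request.get_method(), request.full_url)
```

Passing the object to `urllib.request.urlopen(request)` would send it; a `201 Created` status would then indicate success, as in the exchange shown above.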
COMPATIBLE_CARRIERS
MAERSK MSC CMA CGM +40 more
LINE_ITEMS
01 Electronics 1,200 KG
02 Textiles 850 KG

Line Item Extraction

Autodetects running lists of goods, weights, and seal numbers.
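
To give a feel for what line item detection produces, here is a small sketch that parses rows shaped like the ones in the LINE_ITEMS preview. It is a plain regular expression, not the neural model behind the product; the `LineItem` type and `parse_line_items` function are hypothetical names, and the pattern assumes each row is "number, description, weight in KG".

```python
import re
from typing import List, NamedTuple


class LineItem(NamedTuple):
    number: int
    description: str
    weight_kg: float


# Assumed row shape: "<number> <description> <weight> KG", e.g. "01 Electronics 1,200 KG".
LINE_ITEM_RE = re.compile(r"^(\d+)\s+(.+?)\s+([\d,]+(?:\.\d+)?)\s*KG$", re.IGNORECASE)


def parse_line_items(text: str) -> List[LineItem]:
    """Return every line that matches the assumed row shape; skip the rest."""
    items = []
    for line in text.splitlines():
        match = LINE_ITEM_RE.match(line.strip())
        if match:
            number, description, weight = match.groups()
            # Remove thousands separators before converting the weight.
            items.append(LineItem(int(number), description, float(weight.replace(",", ""))))
    return items


for item in parse_line_items("LINE_ITEMS\n01 Electronics 1,200 KG\n02 Textiles 850 KG"):
    print(item)
```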

VALIDATION
MSCU 123456-6 VALID
Check Digit: OK · ISO 6346

ISO 6346 Validation

We verify the check digit of every container ID against ISO 6346 to catch transcription errors.
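
The check-digit rule itself is public, so it can be sketched in a few lines. The version below follows the ISO 6346 scheme (letters map to values from 10 upward, skipping multiples of 11; each position is weighted by a power of two; the sum is taken modulo 11). Function names are hypothetical, and `CSQU3054383` is a commonly cited example ID rather than one taken from this page.

```python
import re


def _build_letter_values() -> dict:
    """ISO 6346 letter values: A=10 upward, skipping multiples of 11."""
    values = {}
    next_value = 10
    for letter in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
        if next_value % 11 == 0:
            next_value += 1
        values[letter] = next_value
        next_value += 1
    return values


LETTER_VALUES = _build_letter_values()


def compute_check_digit(first_ten: str) -> int:
    """Check digit for the 4-letter prefix plus the 6-digit serial number."""
    total = 0
    for position, char in enumerate(first_ten):
        value = LETTER_VALUES[char] if char.isalpha() else int(char)
        total += value * (2 ** position)
    # A remainder of 10 is written as check digit 0.
    return total % 11 % 10


def is_valid_container_id(raw: str) -> bool:
    """Accept IDs with or without spaces and hyphens, e.g. 'CSQU 305438-3'."""
    normalized = re.sub(r"[\s-]", "", raw).upper()
    if not re.fullmatch(r"[A-Z]{4}\d{7}", normalized):
        return False
    return compute_check_digit(normalized[:10]) == int(normalized[10])


print(is_valid_container_id("CSQU 305438-3"))
```

A single mistyped character changes the weighted sum, which is why the check digit catches most OCR and keying errors.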

INTEGRATIONS
CargoWise Magaya Flexport

TMS Integration

Webhooks push extracted data directly into your TMS.
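
On the receiving side, a webhook endpoint can be very small. The sketch below assumes the webhook body is the same JSON object shown in the extraction preview; the actual payload format, the required fields, and the response codes are all assumptions, and `parse_webhook_body` and `WebhookHandler` are hypothetical names.

```python
import json
from http.server import BaseHTTPRequestHandler

# Assumed required fields, based on the sample JSON on this page.
REQUIRED_FIELDS = ("bl_number", "container_id")


def parse_webhook_body(body: bytes) -> dict:
    """Decode a webhook body and check that the assumed required fields exist."""
    record = json.loads(body.decode("utf-8"))
    if not isinstance(record, dict):
        raise ValueError("expected a JSON object")
    missing = [field for field in REQUIRED_FIELDS if field not in record]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return record


class WebhookHandler(BaseHTTPRequestHandler):
    """Minimal receiver: 204 on a usable payload, 400 otherwise."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            record = parse_webhook_body(self.rfile.read(length))
        except ValueError:  # also covers malformed JSON and bad UTF-8
            self.send_response(400)
            self.end_headers()
            return
        print("received", record["bl_number"])  # hand off to your TMS here
        self.send_response(204)
        self.end_headers()
```

To try it locally, serve the handler with `http.server.HTTPServer(("", 8080), WebhookHandler).serve_forever()` and point a test webhook at port 8080.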

PROCESSING_PIPELINE

How It Works

STAGE_01 // INGEST
UPLOAD_QUEUE 3 files
bol_maersk_001.pdf
scan_invoice_44.jpg
attachment_fwd.eml

Multi-Channel Capture

Accepts raw files via API, Email, or SFTP upload.
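
For the email channel, the first step is pulling document attachments out of a forwarded message such as `attachment_fwd.eml`. The sketch below shows one way to do that with Python's standard `email` package. It illustrates the idea only and says nothing about how the hosted ingest works; `extract_attachments` and the extension list are assumptions.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage
from typing import List, Tuple

# Assumed set of document types worth forwarding to extraction.
DOCUMENT_EXTENSIONS = (".pdf", ".jpg", ".jpeg", ".png")


def extract_attachments(raw_eml: bytes) -> List[Tuple[str, bytes]]:
    """Return (filename, content) pairs for document attachments in an .eml file."""
    message = message_from_bytes(raw_eml, policy=policy.default)
    found = []
    for part in message.iter_attachments():
        filename = part.get_filename() or ""
        if filename.lower().endswith(DOCUMENT_EXTENSIONS):
            # decode=True reverses the base64 transfer encoding.
            found.append((filename, part.get_payload(decode=True)))
    return found


# Build a tiny in-memory email to demonstrate the round trip.
demo = EmailMessage()
demo["Subject"] = "Fwd: BOL"
demo.set_content("See attached.")
demo.add_attachment(
    b"%PDF-1.4 demo", maintype="application", subtype="pdf", filename="bol_maersk_001.pdf"
)
print(extract_attachments(demo.as_bytes()))
```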

STAGE_02 // EXTRACT
ocr_engine.log
99.8%
> Detecting tables... OK
> Validating IDs... OK
> Mapping schema... OK
✓ Extraction complete

OCR + Logic Validation

Maps unstructured pixels to structured JSON schemas.
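
The last step of that mapping, turning raw field text into typed values, can be illustrated with the sample document at the top of this page, where "MSCU 987234-1" becomes "MSCU9872341" and "28,459 KG" becomes 28459. The sketch below is hypothetical: the `BolRecord` type, the `normalize` function, and the raw field labels used as dictionary keys are assumptions, and it assumes whole-kilogram weights.

```python
import re
from dataclasses import asdict, dataclass


@dataclass
class BolRecord:
    bl_number: str
    container_id: str
    type: str
    weight_kg: int


def normalize(raw: dict) -> BolRecord:
    """Map raw OCR label/value text onto typed schema fields."""
    # "MSCU 987234-1" -> "MSCU9872341": drop spaces and hyphens.
    container_id = re.sub(r"[\s-]", "", raw["Container"]).upper()
    # "28,459 KG" -> 28459: keep digits only (assumes whole kilograms).
    weight_kg = int(re.sub(r"\D", "", raw["Weight"]))
    return BolRecord(
        bl_number=raw["B/L Number"].strip(),
        container_id=container_id,
        type=raw["Type"].strip(),
        weight_kg=weight_kg,
    )


raw_fields = {
    "B/L Number": "MSCU9872341-001",
    "Container": "MSCU 987234-1",
    "Type": "40HC",
    "Weight": "28,459 KG",
}
print(asdict(normalize(raw_fields)))
```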

STAGE_03 // SYNC
SAP S/4HANA · ERP Integration · Live
NetSuite · Cloud ERP · Live

ERP & Webhook Push

Delivers clean data directly to your ERP or any webhook endpoint.

SCHEMA_CATALOG

Supported Document Types

Bill of Lading

ACCURACY: 98.5%
EXTRACTED_ENTITIES:
Shipper Consignee Container_ID Gross_Weight Port_Codes
Ocean BOL • House BOL • Sea Waybill
View Schema →

Commercial Invoice

ACCURACY: 97.2%
EXTRACTED_ENTITIES:
Line_Items Unit_Price Incoterms Currency Total_Value
FOB • CIF • EXW • DDP
View Schema →

Packing List

ACCURACY: 96.8%
EXTRACTED_ENTITIES:
Carton_No Dimensions Net_Weight HS_Code CBM
Full Container • LCL • Palletized
View Schema →

Air Waybill

ACCURACY: 95.4%
EXTRACTED_ENTITIES:
MAWB_No HAWB_No Flight_Route Chargeable_Wt
IATA • e-AWB • Consolidated
View Schema →

Customs Declaration

ACCURACY: 94.1%
EXTRACTED_ENTITIES:
HS_Code Duty_Rate Country_Origin CIF_Value
CBP 7501 • ISF • Entry Summary
View Schema →

Delivery Note / POD

ACCURACY: 93.6%
EXTRACTED_ENTITIES:
Signature Timestamp Status Receiver_Name
ePOD • Signed • GPS Verified
View Schema →
BENCHMARK_MATRIX

Extraction Method Analysis

Comparative performance metrics across document processing approaches.

Metric                 | Manual Entry    | Legacy OCR / Zonal | PDF2TEXT Neural
Handwritten Notes      | 0%              | Fail               | 96%
Layout Variance        | High Cost       | Breaks             | Layout Agnostic
Setup Time             | 0h              | 40h per template   | Zero-shot
Processing Speed       | 5 min/doc       | 2 min/doc          | <0.8s/doc
Multi-language Support | Staff dependent | Limited            | 40+ languages
Error Rate             | 2-5%            | 8-15%              | <0.5%

Benchmarks measured on standardized BOL dataset (n=10,000). Legacy OCR tested with ABBYY FlexiCapture. Neural model: pdf2text-v3-logistics.

INTEGRATION_GUIDE

Built for developers, by developers.

Webhooks, retries, and rate limiting handled out of the box. SDKs available for Python, Node, and Go.

install_and_run.sh
# 1. Install the SDK
npm install @pdf2text/sdk

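// 2. Extract a document (save as extract.js, then run: node extract.js)
const pdf2text = require('@pdf2text/sdk');

// CommonJS has no top-level await, so wrap the call in an async function.
(async () => {
  const data = await pdf2text.extract({
    file: './bol_maersk.pdf',
    mode: 'ocr_dense'
  });

  console.log(data.container_id); // "MSCU9872341"
})();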
# Extract document via REST API
curl -X POST https://pdf2text.ai/api/v1/documents/upload/ \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@./bol_maersk.pdf" \
  -F "mode=ocr_dense"

# Response
{
  "container_id": "MSCU9872341",
  "confidence": 0.998
}
# 1. Install the SDK
pip install pdf2text-sdk

# 2. Extract a document
from pdf2text import Client

client = Client(api_key="your_api_key")
data = client.extract(
    file="./bol_maersk.pdf",
    mode="ocr_dense"
)

print(data.container_id)  # "MSCU9872341"
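
The REST response above includes a `confidence` score, which is useful for deciding when a human should take a second look. A minimal sketch, assuming the response is available as a dictionary; the 0.95 threshold and the `needs_manual_review` helper are illustrative choices, not SDK features.

```python
def needs_manual_review(response: dict, threshold: float = 0.95) -> bool:
    """Flag a result whose confidence is below the threshold or missing."""
    # A missing score is treated as untrusted.
    return response.get("confidence", 0.0) < threshold


print(needs_manual_review({"container_id": "MSCU9872341", "confidence": 0.998}))
```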
Trusted by engineering teams at:
Flexport project44 Convoy Shippo
ROI_SIMULATOR

Calculate Your Savings

Estimate cost reduction based on your document processing volume.

VOLUME_INPUTS
Document volume: 100 to 50,000
Cost ($): 15 to 50
Handling time: 2 to 15 min
PROJECTED_SAVINGS
Savings per month ($) · Hours Saved · Annual Savings
Cost Comparison: Manual Entry ($) vs. PDF2TEXT ($)

Based on average extraction latency of 1.2s vs human average of 300s. Estimate uses Pro plan pricing (~$0.15/doc).
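
The arithmetic behind an estimate like this is easy to reproduce. The sketch below uses the figures quoted above (1.2s extraction latency, roughly $0.15 per document) and assumes the dollar input is an hourly labor rate and the volume is per month. The exact formula used by the simulator is not published here, so treat `estimate_savings` as an approximation.

```python
def estimate_savings(
    docs_per_month: int,
    hourly_cost: float,
    minutes_per_doc: float,
    price_per_doc: float = 0.15,      # approximate Pro plan price quoted above
    extraction_seconds: float = 1.2,  # average extraction latency quoted above
) -> dict:
    """Estimate monthly and annual savings versus manual entry."""
    manual_cost = docs_per_month * minutes_per_doc * hourly_cost / 60
    automated_cost = docs_per_month * price_per_doc
    seconds_saved = docs_per_month * (minutes_per_doc * 60 - extraction_seconds)
    monthly = manual_cost - automated_cost
    return {
        "manual_cost": manual_cost,
        "automated_cost": automated_cost,
        "monthly_savings": monthly,
        "annual_savings": monthly * 12,
        "hours_saved": seconds_saved / 3600,
    }


# Example inputs: 1,000 documents, $30/hour, 5 minutes (300s) per document.
for name, value in estimate_savings(1000, 30.0, 5.0).items():
    print(f"{name}: {value:,.2f}")
```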

Test with your own data.

Upload a messy PDF, image, or scan. No API key required.

LIVE_DEMO_v2.0
> Uploading... OK
> OCR... OK
> Extracting Entities... ▋

Drop PDF BOL here

or click to browse

> Extraction complete. 12 fields parsed.
Container ID: MSCU9872341 ✓
Type: 40HC
Weight: 28,459 KG
Shipper: ACME Manufacturing Co.
SSL
24h auto-delete
Max 50MB
SYSTEM_FAQ

Operational Specifications

Technical details for security review and integration planning.

Is data retained?

No. Zero-retention policy. Documents are processed in RAM and flushed immediately after webhook delivery. AES-256 encryption in transit.

View Security Policy →

On-premise available?

Yes. Docker container deployment for air-gapped environments. Kubernetes Helm charts available. Minimum 8 GB RAM, GPU optional.

Contact Sales →

How do I integrate?

REST API with webhooks. SDKs for Python, Node.js, Go. Pre-built connectors for SAP, Oracle, NetSuite. ERP sync via Zapier or direct API.

View API Docs →

Handwriting support?

Yes. Neural model trained on 2M+ handwritten samples. 96% accuracy on cursive English. Supports annotations, stamps, and margin notes.

View Accuracy Report →

Custom document types?

Yes. Fine-tuning available for proprietary formats. Provide 50+ samples and we train a specialized extraction model within 48 hours.

Request Custom Model →

Uptime guarantee?

99.9% SLA on Enterprise plans. Multi-region failover (US-East, EU-West, APAC). Real-time status page with incident history.

View Status Page →
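
For planning purposes, an availability percentage translates directly into a downtime budget. The short calculation below is generic arithmetic rather than a statement of contract terms; `allowed_downtime_minutes` is a hypothetical helper, and the 30-day and 365-day windows are assumptions about how the period is measured.

```python
def allowed_downtime_minutes(sla_percent: float, days: int) -> float:
    """Maximum downtime, in minutes, that still meets the SLA over the period."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - sla_percent) / 100


print(round(allowed_downtime_minutes(99.9, 30), 1), "minutes per 30 days")
print(round(allowed_downtime_minutes(99.9, 365) / 60, 2), "hours per year")
```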