Extract Bills of Lading into JSON/CSV instantly.

Parses container IDs, notify parties, and HS codes from Maersk, MSC, and CMA CGM. Push directly to CargoWise or Flexport.

View API Documentation →

EXTRACTION_MONITOR v2.1.0

READY

Source_File.pdf

BILL OF LADING

MSC MEDITERRANEAN SHIPPING CO.

B/L Number

MSCU9872341-001

Shipper

ACME MANUFACTURING CO LTD

Container

MSCU 987234-1

Type: 40HC · Weight: 28,459 KG

Consignee

GLOBAL IMPORTS INC

BOL_MSCU9872341.pdf

// Extracted BOL Data
{
  "bl_number": "MSCU9872341-001",
  "container_id": "MSCU9872341",  // ✓ 99.8% confidence
  "container_valid": true,
  "type": "40HC",
  "weight_kg": 28459,
  "shipper": "ACME Manufacturing Co.",
  "consignee": "Global Imports Inc.",
  "port_load": "CNSHA",
  "port_discharge": "USLGB"
}

bl_number,container_id,type,weight_kg,shipper,consignee
MSCU9872341-001,MSCU9872341,40HC,28459,"ACME Manufacturing Co.","Global Imports Inc."
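
A CSV row like the one above can be produced from the JSON record with the Python standard library. This is a minimal sketch, assuming the field names from the sample; `to_csv` and `CSV_FIELDS` are illustrative names and not part of the PDF2TEXT SDK.

```python
import csv
import io

# Column order taken from the sample CSV header on this page.
CSV_FIELDS = ["bl_number", "container_id", "type", "weight_kg", "shipper", "consignee"]


def to_csv(records):
    """Serialize extracted BOL records (dicts) to CSV text.

    Hypothetical helper for illustration. Keys that are not listed in
    CSV_FIELDS (for example port_load) are ignored.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=CSV_FIELDS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


record = {
    "bl_number": "MSCU9872341-001",
    "container_id": "MSCU9872341",
    "type": "40HC",
    "weight_kg": 28459,
    "shipper": "ACME Manufacturing Co.",
    "consignee": "Global Imports Inc.",
    "port_load": "CNSHA",
    "port_discharge": "USLGB",
}

print(to_csv([record]))
```

The `csv` module quotes a field only when it contains a delimiter, quote, or newline, so the company names are written unquoted here; both forms parse to the same values.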

POST /api/v1/shipments
Authorization: Bearer ••••••••

{
  "system": "cargowise",
  "payload": { ... }
}

→ 201 Created
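
The request above could be assembled in Python roughly as follows. This is a sketch under stated assumptions: the host `https://pdf2text.ai` is borrowed from the upload endpoint in the integration guide, the helper name `build_shipment_request` is made up for illustration, and the payload contents are a placeholder because the real payload schema is not shown on this page.

```python
import json
import urllib.request

# Assumption: the shipments endpoint lives on the same host as the
# document upload endpoint shown in the integration guide.
BASE_URL = "https://pdf2text.ai"


def build_shipment_request(api_key, system, payload):
    """Build (but do not send) a POST /api/v1/shipments request."""
    body = json.dumps({"system": system, "payload": payload}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/shipments",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


request = build_shipment_request(
    api_key="your_api_key",
    system="cargowise",
    # Placeholder payload: the actual fields are elided in the example above.
    payload={"container_id": "MSCU9872341", "type": "40HC"},
)
print(request.get_method(), request.full_url)
```

Sending is left to the caller, for example with `urllib.request.urlopen(request)`; the example above suggests a `201 Created` status on success.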

COMPATIBLE_CARRIERS

MAERSK · MSC · CMA CGM · HAPAG · COSCO · ONE · +40 more

LINE_ITEMS

01 Electronics 1,200 KG

02 Textiles 850 KG

Line Item Extraction

Autodetects running lists of goods, weights, and seal numbers.
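
To illustrate the shape of the output, the two sample rows above can be turned into structured records with a simple pattern. This regex is only a stand-in for illustration and is not how the extraction model works; `parse_line_items` and the output keys are hypothetical.

```python
import re

# Matches rows such as "01 Electronics 1,200 KG":
# a line number, a free-text description, and a weight in kilograms.
LINE_ITEM = re.compile(
    r"^(?P<no>\d+)\s+(?P<desc>.+?)\s+(?P<weight>[\d,]+(?:\.\d+)?)\s*KG$"
)


def parse_line_items(text):
    """Return one dict per line that looks like a line item; skip the rest."""
    items = []
    for line in text.splitlines():
        match = LINE_ITEM.match(line.strip())
        if match:
            items.append(
                {
                    "line": int(match["no"]),
                    "description": match["desc"],
                    # Drop thousands separators before converting.
                    "weight_kg": float(match["weight"].replace(",", "")),
                }
            )
    return items


sample = "01 Electronics 1,200 KG\n\n02 Textiles 850 KG"
for item in parse_line_items(sample):
    print(item)
```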

VALIDATION

MSCU 123456-6 VALID

Check Digit: OK · ISO 6346

ISO 6346 Validation

We verify the check digit of every container ID against ISO 6346 to catch misread or mistyped characters.
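
For reference, the ISO 6346 check digit can be computed in a few lines. The sketch below follows the published algorithm (letter values starting at 10 and skipping multiples of 11, position weights of 2^n, sum modulo 11, with 10 mapped to 0); the function names are illustrative, and this is not necessarily how PDF2TEXT implements the check.

```python
import re


def _letter_values():
    """ISO 6346 letter values: A=10, counting up, skipping 11, 22 and 33."""
    values, number = {}, 10
    for letter in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
        if number % 11 == 0:
            number += 1
        values[letter] = number
        number += 1
    return values


LETTER_VALUES = _letter_values()

# Four letters (owner code plus category identifier), six serial digits,
# and one check digit.
CONTAINER_RE = re.compile(r"^[A-Z]{4}\d{7}$")


def check_digit(first_ten):
    """Compute the check digit for the first ten characters of a container ID."""
    total = 0
    for position, char in enumerate(first_ten):
        value = LETTER_VALUES[char] if char.isalpha() else int(char)
        total += value * (2 ** position)
    # A remainder of 10 is written as 0.
    return total % 11 % 10


def is_valid_container_id(raw):
    """Validate an ID such as 'CSQU 305438-3' (spaces and hyphens are ignored)."""
    normalized = re.sub(r"[\s-]", "", raw.upper())
    if not CONTAINER_RE.match(normalized):
        return False
    return check_digit(normalized[:10]) == int(normalized[10])


# CSQU3054383 is a commonly cited valid example.
print(is_valid_container_id("CSQU 305438-3"))
```

A single misread character changes the weighted sum, so most OCR substitutions (for example `8` read as `3`) fail the check.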

INTEGRATIONS

CargoWise · Magaya · Flexport

TMS Integration

Webhooks push extracted data directly to your transportation management system (TMS).
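
On the receiving side, a webhook endpoint can be as small as the sketch below. It assumes the webhook body is the extracted JSON record shown earlier on this page; the actual payload format, any signature headers, and the names `parse_webhook_body`, `WebhookHandler`, and `serve` are assumptions for illustration only.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: these fields are present in every webhook body.
REQUIRED_FIELDS = ("bl_number", "container_id")


def parse_webhook_body(body):
    """Decode a webhook body and check the fields this receiver relies on."""
    record = json.loads(body)
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return record


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            record = parse_webhook_body(self.rfile.read(length))
        except ValueError as error:  # json.JSONDecodeError is a ValueError
            self.send_error(400, str(error))
            return
        # Hand the record to your TMS here; this sketch only logs it.
        print("received", record["bl_number"])
        self.send_response(204)
        self.end_headers()


def serve(port=8080):
    """Run the receiver locally; call this explicitly to start listening."""
    HTTPServer(("127.0.0.1", port), WebhookHandler).serve_forever()
```

Returning a 4xx status for malformed bodies and a 2xx status only after the record is accepted lets the sender's retry logic distinguish bad payloads from transient failures.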

PROCESSING_PIPELINE

How It Works

STAGE_01 // INGEST

UPLOAD_QUEUE 3 files

bol_maersk_001.pdf

scan_invoice_44.jpg

attachment_fwd.eml

Multi-Channel Capture

Accepts raw files via API, email, or SFTP upload.
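
For the email channel, a forwarded message such as `attachment_fwd.eml` first has to be split into its attachments. The sketch below shows one way to do that with Python's standard `email` package; it illustrates the idea and is not a description of the PDF2TEXT ingest service. `extract_pdf_attachments` is a hypothetical helper.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_pdf_attachments(raw_email):
    """Return (filename, content) pairs for PDF attachments in a raw .eml message."""
    message = message_from_bytes(raw_email, policy=policy.default)
    attachments = []
    for part in message.iter_attachments():
        if part.get_content_type() == "application/pdf":
            # get_content() returns the decoded bytes for binary parts.
            attachments.append((part.get_filename(), part.get_content()))
    return attachments


# Build a small message in memory so the example runs without a real .eml file.
forwarded = EmailMessage()
forwarded["Subject"] = "Fwd: Bill of lading"
forwarded.set_content("See attached bill of lading.")
forwarded.add_attachment(
    b"%PDF-1.4 placeholder",
    maintype="application",
    subtype="pdf",
    filename="bol_maersk_001.pdf",
)

for name, content in extract_pdf_attachments(forwarded.as_bytes()):
    print(name, len(content), "bytes")
```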

STAGE_02 // EXTRACT

ocr_engine.log

99.8%

> Detecting tables... OK

> Validating IDs... OK

> Mapping schema... OK

✓ Extraction complete

OCR + Logic Validation

Maps unstructured pixels to structured JSON schemas.
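
The schema-mapping step can be pictured as turning label/value text read from the page into typed fields. The sketch below uses the sample values from the document preview at the top of this page; the `BolRecord` class and the helper functions are illustrative assumptions, not the product's actual schema code.

```python
import re
from dataclasses import asdict, dataclass


@dataclass
class BolRecord:
    """Subset of the JSON schema shown in the sample output."""
    bl_number: str
    container_id: str
    type: str
    weight_kg: int


def normalize_container_id(raw):
    """'MSCU 987234-1' -> 'MSCU9872341' (strip spaces and punctuation)."""
    return re.sub(r"[^A-Z0-9]", "", raw.upper())


def parse_weight_kg(raw):
    """'28,459 KG' -> 28459."""
    match = re.search(r"([\d,]+)\s*KG", raw, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"no weight in kilograms found in {raw!r}")
    return int(match.group(1).replace(",", ""))


def map_fields(raw_fields):
    """Map label/value pairs from the page onto the typed record."""
    return BolRecord(
        bl_number=raw_fields["B/L Number"].strip(),
        container_id=normalize_container_id(raw_fields["Container"]),
        type=raw_fields["Type"].strip(),
        weight_kg=parse_weight_kg(raw_fields["Weight"]),
    )


raw = {
    "B/L Number": "MSCU9872341-001",
    "Container": "MSCU 987234-1",
    "Type": "40HC",
    "Weight": "28,459 KG",
}
print(asdict(map_fields(raw)))
```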

STAGE_03 // SYNC

SAP S/4HANA · ERP Integration · Live

NetSuite · Cloud ERP · Live

ERP & Webhook Push

Signature Timestamp Status Receiver_Name

ePOD • Signed • GPS Verified

View Schema →

BENCHMARK_MATRIX

Extraction Method Analysis

Comparative performance metrics across document processing approaches.

Metric	Manual Entry	Legacy OCR / Zonal	PDF2TEXT Neural
Handwritten Notes	0%	Fail	96%
Layout Variance	High Cost	Breaks	Layout Agnostic
Setup Time	0h	40h per template	Zero-shot
Processing Speed	5 min/doc	2 min/doc	<0.8s/doc
Multi-language Support	Staff dependent	Limited	40+ languages
Error Rate	2-5%	8-15%	<0.5%

Benchmarks were measured on a standardized BOL dataset (n=10,000). Legacy OCR was tested with ABBYY FlexiCapture. Neural model: pdf2text-v3-logistics.

INTEGRATION_GUIDE

Built for developers, by developers.

Webhooks, retries, and rate limiting are handled out of the box. SDKs are available for Python, Node.js, and Go.

Python Node.js Go REST API

install_and_run.sh

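# 1. Install the SDK
npm install @pdf2text/sdk

// 2. Extract a document (save as extract.js, run with: node extract.js)
const pdf2text = require('@pdf2text/sdk');

// CommonJS modules do not allow top-level await, so wrap the call.
async function main() {
  const data = await pdf2text.extract({
    file: './bol_maersk.pdf',
    mode: 'ocr_dense'
  });

  console.log(data.container_id); // "MSCU9872341"
}

main().catch(console.error);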

# Extract document via REST API
curl -X POST https://pdf2text.ai/api/v1/documents/upload/ \
  -H "Authorization: Bearer $API_KEY" \
  -F "file=@./bol_maersk.pdf" \
  -F "mode=ocr_dense"

# Response
{
  "container_id": "MSCU9872341",
  "confidence": 0.998
}

# 1. Install the SDK
pip install pdf2text-sdk

# 2. Extract a document
from pdf2text import Client

client = Client(api_key="your_api_key")
data = client.extract(
    file="./bol_maersk.pdf",
    mode="ocr_dense"
)

print(data.container_id)  # "MSCU9872341"

Trusted by engineering teams at:

Flexport · project44 · Convoy · Shippo

ROI_SIMULATOR

Calculate Your Savings

Estimate cost reduction based on your document processing volume.

VOLUME_INPUTS

Monthly BOL Volume: 100 to 50,000

Clerk Hourly Rate: $15 to $50

Minutes per Document: 2 to 15

PROJECTED_SAVINGS

Outputs: savings per month, hours saved, annual savings, and a cost comparison of manual entry versus PDF2TEXT.

Based on an average extraction latency of 1.2 s versus a human average of 300 s per document. The estimate uses Pro plan pricing (~$0.15/doc).
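
The arithmetic behind such an estimate can be sketched as follows. The simulator's exact formula is not published on this page, so this is an assumption: manual cost is clerk time multiplied by the hourly rate, PDF2TEXT cost is volume multiplied by the quoted per-document price, and hours saved is the difference between manual time and extraction time. `estimate_roi` and `RoiEstimate` are illustrative names.

```python
from dataclasses import dataclass

PRICE_PER_DOC_USD = 0.15   # approximate Pro plan price quoted on this page
EXTRACTION_SECONDS = 1.2   # average extraction latency quoted on this page


@dataclass
class RoiEstimate:
    manual_cost: float       # USD per month
    pdf2text_cost: float     # USD per month
    monthly_savings: float   # USD per month
    annual_savings: float    # USD per year
    hours_saved: float       # hours per month


def estimate_roi(monthly_volume, hourly_rate, minutes_per_doc):
    """Estimate monthly and annual savings from the three simulator inputs."""
    manual_hours = monthly_volume * minutes_per_doc / 60
    automated_hours = monthly_volume * EXTRACTION_SECONDS / 3600
    manual_cost = manual_hours * hourly_rate
    pdf2text_cost = monthly_volume * PRICE_PER_DOC_USD
    monthly_savings = manual_cost - pdf2text_cost
    return RoiEstimate(
        manual_cost=manual_cost,
        pdf2text_cost=pdf2text_cost,
        monthly_savings=monthly_savings,
        annual_savings=monthly_savings * 12,
        hours_saved=manual_hours - automated_hours,
    )


estimate = estimate_roi(monthly_volume=5000, hourly_rate=25, minutes_per_doc=5)
print(f"Monthly savings: ${estimate.monthly_savings:,.2f}")
print(f"Hours saved per month: {estimate.hours_saved:,.1f}")
```

With 5,000 documents a month at $25 per hour and 5 minutes per document, manual entry costs about $10,417 per month against $750 for extraction under these assumptions.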

Test with your own data.

Upload a messy PDF, image, or scan. No API key required.

LIVE_DEMO_v2.0

> Uploading... OK

> OCR... OK

> Extracting Entities... ▋

> Extraction complete. 12 fields parsed.

Container ID	MSCU9872341 ✓
Type	40HC
Weight	28,459 KG
Shipper	ACME Manufacturing Co.

SSL

24h auto-delete

Max 50MB

SYSTEM_FAQ

Operational Specifications

Technical details for security review and integration planning.

Is data retained?

No. Zero-retention policy. Documents are processed in RAM and flushed immediately after webhook delivery. AES-256 encryption in transit.

View Security Policy →

On-premise available?

Yes. Docker container deployment for air-gapped environments. Kubernetes Helm charts available. Minimum 8 GB RAM, GPU optional.

Contact Sales →

How do I integrate?

REST API with webhooks. SDKs for Python, Node.js, Go. Pre-built connectors for SAP, Oracle, NetSuite. ERP sync via Zapier or direct API.

View API Docs →

Handwriting support?

Yes. Neural model trained on 2M+ handwritten samples. 96% accuracy on cursive English. Supports annotations, stamps, and margin notes.

View Accuracy Report →

Custom document types?

Yes. Fine-tuning available for proprietary formats. Provide 50+ samples and we train a specialized extraction model within 48 hours.

Request Custom Model →

Uptime guarantee?

99.9% SLA on Enterprise plans. Multi-region failover (US-East, EU-West, APAC). Real-time status page with incident history.

View Status Page →