Documents in. Structured, verified data out.

Extraction agents pull entities, clauses, and fields from unstructured documents. Every extraction traces back to the source passage.

Solution

Documents in, structured data out

Classification agents auto-tag documents by type, topic, or taxonomy.
Extraction agents pull entities, clauses, and structured fields.
Every extraction cites the source passage it came from.

Classification that works at scale

Auto-tag documents by type, department, intent, or custom taxonomy
Classification agents run on a schedule against new content in your collections
Outputs drive downstream workflows: routing, alerting, or further extraction
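
As a rough illustration of how a classification output might drive routing, here is a minimal sketch in plain Python. The label names, queue names, and the shape of the classification result are all assumptions made for this example; they are not taken from the Knowledge² SDK or its documented output.

```python
# Hypothetical label-to-destination table. The labels and queue names are
# illustrative only and are not part of any documented Knowledge² output.
ROUTES = {
    "contract": "legal-review",
    "invoice": "accounts-payable",
    "regulatory_filing": "compliance",
}


def route(classification: dict, default: str = "manual-triage") -> str:
    """Pick a downstream queue from a classification result.

    Unknown or missing labels fall back to the default queue.
    """
    return ROUTES.get(classification.get("label"), default)


# Assumed result shape: one dict per document with a "label" key.
tagged = [
    {"document": "Acme Corp MSA v2", "label": "contract"},
    {"document": "Unlabeled scan", "label": None},
]
destinations = {item["document"]: route(item) for item in tagged}
```

Keeping the mapping in a plain table means a new document type only needs a new entry, and anything unrecognized goes to a person instead of being dropped.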

Multiple agents with classification and extraction task-type badges

Extraction that cites its sources

Pull named entities, key clauses, dates, amounts, and structured fields from unstructured documents
Every extracted field includes a citation to the source passage
Blueprints for common patterns: contracts, invoices, medical records, regulatory filings
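
Because each field carries a citation, a downstream check can confirm that the cited passage really appears in the source document. The sketch below is one possible way to do that in plain Python. It assumes citations look like `§12.1: quoted text...`, and the helper name and the sample source text are hypothetical.

```python
def citation_is_grounded(citation: str, source_text: str) -> bool:
    """Return True if the quoted part of a citation appears in the source."""

    def normalize(text: str) -> str:
        # Collapse whitespace and ignore case before comparing.
        return " ".join(text.split()).lower()

    # Drop a leading section label such as "§12.1:" if one is present.
    _, _, quoted = citation.partition(":")
    # Fall back to the whole citation when there is no label, then trim
    # surrounding whitespace and any trailing ellipsis.
    quoted = (quoted or citation).strip().rstrip(".").strip()
    return bool(quoted) and normalize(quoted) in normalize(source_text)


# Hypothetical source passage, written for this example.
source = (
    "12.1 Termination. Either party may terminate with 30 days "
    "written notice to the other party."
)
ok = citation_is_grounded(
    "§12.1: Either party may terminate with 30 days written notice...",
    source,
)
missing = citation_is_grounded("§9.9: Licensee shall indemnify...", source)
```

A substring match is deliberately strict. A looser comparison such as fuzzy matching would tolerate OCR noise, at the cost of accepting near misses.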

Extracted data with cited source passages for verification

From documents to structured data

Configure agents via the dashboard or API. No custom ML pipelines required.
Chain extraction with classification: first tag, then extract, then validate
Outputs are versioned, auditable, and retrievable via API
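
The tag, extract, validate chain can be pictured as three small steps run in order. The sketch below shows only that control flow. The keyword classifier and the regex extractor are stand-ins invented for this example, not how the agents work, and none of the function names come from the SDK.

```python
import re


def classify(text: str) -> str:
    # Stand-in for a classification agent: a keyword check, for illustration.
    return "contract" if "agreement" in text.lower() else "other"


def extract(text: str) -> list[dict]:
    # Stand-in for an extraction agent: find "<N> days ... notice" phrases
    # and keep the matched passage as the citation.
    pattern = r"(\d+)[- ]days?(?: written)? notice"
    return [
        {
            "field": "notice_period_days",
            "value": int(match.group(1)),
            "citation": match.group(0),
        }
        for match in re.finditer(pattern, text)
    ]


def validate(items: list[dict], text: str) -> list[dict]:
    # Keep only fields whose citation appears verbatim in the source text.
    return [item for item in items if item["citation"] in text]


def run_pipeline(text: str) -> dict:
    # First tag, then extract (contracts only), then validate.
    label = classify(text)
    fields = validate(extract(text), text) if label == "contract" else []
    return {"label": label, "fields": fields}


result = run_pipeline(
    "Services Agreement. Either party may terminate with 30 days written notice."
)
skipped = run_pipeline("Cafeteria lunch menu for Friday.")
```

Running classification first keeps extraction from being applied to documents it was not written for, and the validation step drops any field that cannot be traced back to the text.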

Extraction agent configured to pull structured fields from contracts

Key terms, dates, and amounts are extracted automatically, with no manual review of every page.
Every extracted field cites the exact clause it came from.
Structured output is available via API for integration with downstream systems.

Example user experience

A legal ops manager queries extracted contract terms

The agent already processed the batch. The manager gets structured data with citations.

Question

What termination clauses are in the contracts uploaded this week?

Extracted results

Three contracts contained termination clauses: Acme Corp (30-day notice, §12.1), Beta Ltd (90-day notice with cure period, §8.3), and Gamma Inc (termination for cause only, §15.2).

Documents processed: 12
Fields extracted: 47
Agent: Contract extraction agent

Implemented with the Knowledge² Python SDK

Keep the implementation surface small

Python SDK example

Python

from sdk import Knowledge2

k2 = Knowledge2(api_key="k2_...")

# Create an extraction agent
agent = k2.create_agent(
    name="contract_extractor",
    corpus_id="corp_contracts",
    system_prompt="Extract key terms, dates, amounts, and termination clauses from contracts. Cite each extraction.",
    schedule="on_ingest",
)

# Query extracted results
results = k2.chat(
    agent_id=agent["agent_id"],
    query="What termination clauses are in this week’s contracts?",
)

Illustrative extraction response

JSON

{
  "extractions": [
    {
      "document": "Acme Corp MSA v2",
      "field": "termination_clause",
      "value": "30-day written notice",
      "citation": "§12.1: Either party may terminate with 30 days written notice..."
    },
    {
      "document": "Beta Ltd Services Agreement",
      "field": "termination_clause",
      "value": "90-day notice with cure period",
      "citation": "§8.3: Termination requires 90 days notice and a 30-day cure period..."
    }
  ]
}
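
A response shaped like the illustrative JSON above could be consumed with nothing more than the standard library. The sketch below assumes that exact shape and that each citation begins with a section label followed by a colon. The `summarize` helper is hypothetical, and a real response may carry different or additional keys.

```python
import json

# Payload copied from the illustrative response; treat the shape as an
# assumption, not a documented contract.
payload = """
{
  "extractions": [
    {
      "document": "Acme Corp MSA v2",
      "field": "termination_clause",
      "value": "30-day written notice",
      "citation": "§12.1: Either party may terminate with 30 days written notice..."
    },
    {
      "document": "Beta Ltd Services Agreement",
      "field": "termination_clause",
      "value": "90-day notice with cure period",
      "citation": "§8.3: Termination requires 90 days notice and a 30-day cure period..."
    }
  ]
}
"""


def summarize(raw: str, field: str) -> dict[str, tuple[str, str]]:
    """Map each document to (value, section reference) for one field."""
    summary = {}
    for item in json.loads(raw)["extractions"]:
        if item["field"] != field:
            continue
        # Assumes the citation starts with a section label and a colon.
        section = item["citation"].partition(":")[0].strip()
        summary[item["document"]] = (item["value"], section)
    return summary


termination = summarize(payload, "termination_clause")
for document, (value, section) in termination.items():
    print(f"{document}: {value} ({section})")
```

Keeping the section reference next to each value lets a reviewer jump straight to the cited clause when spot-checking a result.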

Cited evidence on every answer
Tenant-scoped access controls
Audit logging
VPC / on-prem deployment
SOC 2 readiness

Customer results

31.8% cost reduction per turn. 43-75% less retrieval context.

~$80K modeled annual savings
Elevata
Financial services

Your first agent is twenty minutes away