Metadata & enrichment

Clean metadata from any EPUB, in seconds

Extract and enrich the data that makes a book discoverable — titles, contributors, identifiers, subjects and keywords — normalized to the ONIX for Books 3 standard.

Trusted by publishers worldwide
1,000+ organizations
50+ countries
Powered by the publica.la platform

The data that sells the book

Discoverability starts with metadata, and assembling it by hand is slow and error-prone. Origami reads an EPUB, pulls what is already embedded, and uses AI to fill the gaps — returning a clean record normalized to ONIX for Books 3: title and subtitle, contributors with role codes, publisher, dates, identifiers, language, BISAC and Thema subjects, and retail keywords.

Embedded metadata always wins; the model only fills what is missing and suggests classification and keywords in the language of the book. The result drops straight into your catalogue, ready to feed retailers, aggregators and libraries.
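The precedence rule above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Origami's actual implementation: the function names `is_missing` and `merge_metadata`, the flat dictionary shape, and the sample values are all hypothetical.

```python
from typing import Any, Mapping


def is_missing(value: Any) -> bool:
    """Treat None, blank strings and empty containers as 'not provided'."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    if isinstance(value, (list, tuple, dict, set)):
        return len(value) == 0
    return False


def merge_metadata(
    embedded: Mapping[str, Any], suggested: Mapping[str, Any]
) -> dict[str, Any]:
    """Hypothetical merge: embedded values win, suggestions only fill gaps."""
    # Start from every field the file actually provides.
    merged = {key: value for key, value in embedded.items() if not is_missing(value)}
    # Add a suggested value only where the file left the field empty or absent.
    for key, value in suggested.items():
        if key not in merged and not is_missing(value):
            merged[key] = value
    return merged


# Illustrative data only.
embedded = {"title": "Example Title", "subtitle": "  ", "language": "en", "keywords": []}
suggested = {
    "title": "Different Title",
    "subtitle": "A Subtitle",
    "keywords": ["sample"],
    "publisher": None,
}
record = merge_metadata(embedded, suggested)
print(record)
```

In this sketch the embedded title survives even though a different one was suggested, while the blank subtitle and empty keyword list are filled from the suggestions.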

ONIX 3 normalized

Fields map to the ONIX for Books 3 standard — contributors with List 17 roles, identifiers, dates and more.
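As a rough illustration of what mapping contributors to List 17 roles can involve, the sketch below reads Dublin Core fields from an EPUB package document (OPF) and translates MARC relator codes into ONIX List 17 codes. It is a sketch under assumptions, not a description of Origami's engine: the function name `read_opf_metadata`, the output field names, and the sample data are hypothetical. The role table is a small subset, only the EPUB 2 style `opf:role` attribute is handled, and treating a `dc:creator` without a role as an author is an assumption.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
OPF = "http://www.idpf.org/2007/opf"

# MARC relator code -> ONIX Code List 17 role (illustrative subset only).
MARC_TO_ONIX_ROLE = {
    "aut": "A01",  # By (author)
    "ill": "A12",  # Illustrated by
    "edt": "B01",  # Edited by
    "trl": "B06",  # Translated by
}


def read_opf_metadata(opf_xml: str) -> dict:
    """Hypothetical helper: pull a few Dublin Core fields out of an OPF string."""
    root = ET.fromstring(opf_xml)

    def texts(tag: str) -> list:
        return [
            el.text.strip()
            for el in root.iter(f"{{{DC}}}{tag}")
            if el.text and el.text.strip()
        ]

    contributors = []
    # Assumption: a dc:creator with no explicit role is treated as an author.
    for tag, default_role in (("creator", "aut"), ("contributor", None)):
        for el in root.iter(f"{{{DC}}}{tag}"):
            if not (el.text and el.text.strip()):
                continue
            marc_code = el.get(f"{{{OPF}}}role", default_role)
            contributors.append(
                {
                    "name": el.text.strip(),
                    # None when the relator code is not in the subset above.
                    "onix_role": MARC_TO_ONIX_ROLE.get(marc_code),
                }
            )

    titles = texts("title")
    languages = texts("language")
    return {
        "title": titles[0] if titles else None,
        "language": languages[0] if languages else None,
        "identifiers": texts("identifier"),
        "contributors": contributors,
    }


# Illustrative package document; names and identifier are placeholders.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
            xmlns:opf="http://www.idpf.org/2007/opf">
    <dc:title>Example Title</dc:title>
    <dc:creator opf:role="aut">Jane Example</dc:creator>
    <dc:contributor opf:role="trl">John Sample</dc:contributor>
    <dc:language>en</dc:language>
    <dc:identifier>urn:isbn:0000000000000</dc:identifier>
  </metadata>
</package>"""

meta = read_opf_metadata(SAMPLE_OPF)
print(meta["contributors"])
```

A fuller version would also locate the OPF inside the EPUB archive via `META-INF/container.xml` and handle EPUB 3 `meta refines` role declarations, both of which are left out here to keep the sketch short.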

BISAC & Thema

Suggested subject codes and retail keywords, in the language of the book.

Your data wins

Metadata already embedded in the file always takes precedence over the model’s output.

How it works

1. Upload a file

Drop in an EPUB (or PDF) from your workspace.

2. We read and enrich

Origami reads embedded metadata and uses AI to fill the gaps.

3. Export to catalogue

Review the ONIX-normalized record and feed it to retailers and aggregators.

Questions about metadata extraction

What formats can I extract from?
Origami extracts metadata from EPUB files; the same engine also reads PDF.
What is ONIX?
ONIX for Books is the global standard for communicating book metadata to the trade, maintained by EDItEUR. Normalizing to ONIX means your records are ready for retailers and aggregators.
Does AI overwrite the metadata in my file?
No. Metadata already embedded in the file always wins; AI only fills in what is missing.
Will it suggest BISAC and Thema codes?
Yes, along with retail keywords, in the language of the book.
Is my content used to train AI?
No. Origami’s AI providers do not use your content to train models — see our AI policy for the details.

Metadata is normalized to the ONIX for Books 3 standard maintained by EDItEUR. BISAC is a trademark of BISG; Thema is maintained by EDItEUR.

Clean up your catalogue’s metadata in Origami

Start free and extract metadata from your first title.