PdfDataExtractor

lublak edited this page Oct 18, 2021 · 1 revision

Class: PdfDataExtractor

the extractor for the data of a PDF: text, page data, metadata, outline and permission flags

Table of contents

Accessors

Methods

Accessors

fingerprint

get fingerprint(): string

get the fingerprint

Returns

string

the fingerprint

Defined in

pdfdataextractor.ts:126


pages

get pages(): number

get the number of pages

Returns

number

the number of pages

Defined in

pdfdataextractor.ts:135

Methods

close

close(): Promise<void>

close the extractor

Returns

Promise<void>

a promise that is resolved when destruction is completed

Defined in

pdfdataextractor.ts:240


getMetadata

getMetadata(): Promise<null | MetadataInfo>

get the metadata

Returns

Promise<null | MetadataInfo>

a promise that is resolved with a {MetadataInfo | null} object with information from the metadata section

Defined in

pdfdataextractor.ts:231


getOutline

getOutline(): Promise<null | Outline[]>

get the outline/bookmarks

Returns

Promise<null | Outline[]>

a promise that is resolved with a {Outline[] | null} array with information from the outline tree

Defined in

pdfdataextractor.ts:220


getPageData

getPageData(pages?): Promise<(null | PdfPageData)[]>

get the page data

Parameters

Name Type
pages? number | number[] | (pageNumber: number) => boolean

Returns

Promise<(null | PdfPageData)[]>

a promise that is resolved with a {(PdfPageData | null)[]} array with the extracted data per page

Defined in

pdfdataextractor.ts:179
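
The pages parameter accepts a single number, an array of numbers, or a predicate. The sketch below only illustrates those three shapes. `selectPages` is a hypothetical helper, not part of the library, and it assumes that page numbers are 1-based and that results follow document order, neither of which this page states.

```typescript
// Hypothetical helper, not part of the library: it only illustrates the three
// documented shapes of the `pages` argument.
type PageSelector = number | number[] | ((pageNumber: number) => boolean);

// Assumption: page numbers are 1-based and out-of-range numbers are ignored.
function selectPages(totalPages: number, pages?: PageSelector): number[] {
  const all = Array.from({ length: totalPages }, (_, i) => i + 1);
  if (pages === undefined) return all; // no selector: every page
  if (typeof pages === "number") return all.filter((n) => n === pages);
  if (Array.isArray(pages)) return all.filter((n) => pages.includes(n));
  return all.filter((n) => pages(n)); // predicate form
}

const evenPages = selectPages(6, (n) => n % 2 === 0);
console.log(evenPages); // [ 2, 4, 6 ]
```

The predicate form is convenient when the selection depends on the page number itself (every second page, a range) and the total page count is not known in advance.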


getPermissions

getPermissions(): Promise<null | Permissions>

get the permission flags

Returns

Promise<null | Permissions>

a promise that is resolved with a {Permissions | null} object that contains the permission flags for the PDF

Defined in

pdfdataextractor.ts:144


getText

getText(pages?, sort?): Promise<string[]>

get the text

Parameters

Name Type Default value
pages? number | number[] | (pageNumber: number) => boolean undefined
sort boolean | Sort false

Returns

Promise<string[]>

a promise that is resolved with a {string[]} array with the extracted text per page

Defined in

pdfdataextractor.ts:167
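
Because the result is one string per page, simple post-processing can work directly on the array. The following is a minimal sketch of a case-insensitive search over such a result. `findPagesContaining` and the sample strings are illustrative and not part of the library. The returned values are array positions (0-based), which correspond to page numbers only if all pages were requested.

```typescript
// Hypothetical helper: returns the positions (0-based) of the entries in a
// getText()-style result that contain the search term, ignoring case.
function findPagesContaining(texts: string[], term: string): number[] {
  const needle = term.toLowerCase();
  const hits: number[] = [];
  texts.forEach((text, index) => {
    if (text.toLowerCase().includes(needle)) hits.push(index);
  });
  return hits;
}

// Stand-in for the resolved value of `await extractor.getText()`.
const sampleTexts = ["Introduction", "Methods and results", "Results discussion"];
console.log(findPagesContaining(sampleTexts, "results")); // [ 1, 2 ]
```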


get

Static get(data, options?): Promise<PdfDataExtractor>

get the extractor for the data

Parameters

Name Type Description
data Uint8Array the binary data file
options PdfDataExtractorOptions -

Returns

Promise<PdfDataExtractor>

a promise that is resolved with a {PdfDataExtractor} object to pull the extracted data from

Defined in

pdfdataextractor.ts:110
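
Taken together, the static get method, the accessors and close suggest an open, read, close lifecycle. The sketch below shows that lifecycle against structural interfaces that mirror only the members documented on this page, so it does not assume how the package is imported. `ExtractorLike`, `ExtractorFactory`, `Summary` and `summarize` are hypothetical names, and requesting page 1 assumes 1-based page numbers.

```typescript
// Structural types mirroring only the members documented on this page.
interface ExtractorLike {
  readonly pages: number;
  readonly fingerprint: string;
  getText(pages?: number | number[] | ((pageNumber: number) => boolean)): Promise<string[]>;
  close(): Promise<void>;
}

interface ExtractorFactory {
  get(data: Uint8Array): Promise<ExtractorLike>;
}

interface Summary {
  fingerprint: string;
  pages: number;
  firstPageText: string;
}

// Hypothetical helper: open the extractor, read from it, and always close it.
async function summarize(factory: ExtractorFactory, data: Uint8Array): Promise<Summary> {
  const extractor = await factory.get(data);
  try {
    // Assumption: page numbers are 1-based, so 1 selects the first page.
    const [firstPageText = ""] = await extractor.getText(1);
    return { fingerprint: extractor.fingerprint, pages: extractor.pages, firstPageText };
  } finally {
    // Runs on success and on failure, so the extractor is not left open.
    await extractor.close();
  }
}
```

With the real class, a call would presumably look like `summarize(PdfDataExtractor, bytes)`, where `bytes` is a Uint8Array holding the file contents. Placing close in a finally block means the extractor is released even when reading throws.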