MLAPDF

Class MLA (Media Library Assistant) PDF extracts legacy and XMP metadata from PDF files

package

Media Library Assistant

since 2.10

Methods

Build an array of indirect object definitions

_build_pdf_indirect_objects( &$string) : void
static

Creates the array of indirect object offsets and lengths

since 2.10

Arguments

$string

Extract dictionary from traditional cross-reference + trailer documents

_extract_pdf_trailer( $file_name,  $file_offset) : mixed
static
since 2.10

Arguments

$file_name

$file_offset

Response

mixed

array of "PDF dictionary arrays", newest first, or NULL on failure

Find the offset, length and contents of an indirect object containing a dictionary

_find_pdf_indirect_dictionary( $file_name,  $object,  $generation,  $instance = NULL) : mixed
static

The function searches the entire file, if necessary, to find the last/most recent copy of the object. This is required because Adobe Acrobat does NOT increment the generation number when it reuses an object.

since 2.10

Arguments

$file_name

$object

$generation

$instance

Response

mixed

NULL on failure, else array( 'start' => offset in the file, 'length' => object length, 'content' => dictionary contents )
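
The "last/most recent copy" search described above can be sketched as follows. This is a minimal illustration, not the MLAPDF implementation: the function name find_last_object_copy is hypothetical, it works on a string already held in memory rather than on a file name, and its 'content' element is the raw text between "obj" and "endobj", not a parsed dictionary. It also assumes the word "endobj" does not occur inside the object body (for example, inside a stream).

```php
<?php
// Hypothetical helper (not part of MLAPDF): find the last "N G obj" header
// in a PDF byte string and return its position, length and raw body.
function find_last_object_copy( $pdf_bytes, $object, $generation ) {
	// The lookbehind keeps "1 0 obj" from matching inside "11 0 obj".
	$pattern = '/(?<![0-9])' . (int) $object . '\s+' . (int) $generation . '\s+obj\b/';
	if ( ! preg_match_all( $pattern, $pdf_bytes, $matches, PREG_OFFSET_CAPTURE ) ) {
		return NULL;
	}

	// The final match is the copy that appears last in the file.
	$last = end( $matches[0] );
	$start = $last[1];
	$body_start = $start + strlen( $last[0] );

	$end = strpos( $pdf_bytes, 'endobj', $body_start );
	if ( false === $end ) {
		return NULL;
	}

	return array(
		'start' => $start,
		'length' => ( $end + strlen( 'endobj' ) ) - $start,
		'content' => trim( substr( $pdf_bytes, $body_start, $end - $body_start ) ),
	);
}

// Object 1 generation 0 appears twice; the second copy reuses the same numbers.
$pdf = "1 0 obj\n<< /Title (Old) >>\nendobj\n"
	. "2 0 obj\n<< >>\nendobj\n"
	. "1 0 obj\n<< /Title (New) >>\nendobj\n";

$found = find_last_object_copy( $pdf, 1, 0 );
$missing = find_last_object_copy( $pdf, 9, 0 );
```

Here $found describes the second copy of object 1, and $missing is NULL because no object 9 exists in the sample.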

Parse a PDF dictionary object

_parse_pdf_dictionary( &$source_string,  $offset) : array
static

Returns an array of dictionary contents, classified by object type: boolean, numeric, string, hex (string), indirect (object), name, array, dictionary, stream, and null. The array also has a '/length' element containing the number of bytes occupied by the dictionary in the source string, excluding the enclosing delimiters.

since 2.10

Arguments

$source_string

$offset

Response

array

( '/length' => length, key => array( 'type' => type, 'value' => value ) ) for each dictionary field
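
To make the response shape concrete, the sketch below builds an array by hand in the documented form and reads typed values from it. The helper dictionary_value is hypothetical, and the key spellings ('Title', 'Count', without a leading slash) are an assumption made for the example; the actual keys produced by _parse_pdf_dictionary may differ.

```php
<?php
// Hypothetical helper (not part of MLAPDF): return a field's value from a
// parsed dictionary array, optionally requiring a specific object type.
function dictionary_value( $dictionary, $key, $type = NULL ) {
	// The '/length' element is a plain number, not a field, so it is skipped.
	if ( ! isset( $dictionary[ $key ] ) || ! is_array( $dictionary[ $key ] ) ) {
		return NULL;
	}

	$field = $dictionary[ $key ];
	if ( ( NULL !== $type ) && ( $type !== $field['type'] ) ) {
		return NULL;
	}

	return $field['value'];
}

// Hand-built sample in the documented shape; the values are illustrative only.
$dictionary = array(
	'/length' => 38,
	'Title' => array( 'type' => 'string', 'value' => 'Annual Report' ),
	'Count' => array( 'type' => 'numeric', 'value' => 12 ),
);

$title = dictionary_value( $dictionary, 'Title', 'string' );
$wrong_type = dictionary_value( $dictionary, 'Count', 'string' );
$absent = dictionary_value( $dictionary, 'Author' );
```

Checking the 'type' element before using 'value' avoids treating, say, an indirect reference as if it were a literal string.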

Parse a PDF Linearization Parameter Dictionary object

_parse_pdf_LPD_dictionary( &$source_string,  $filesize) : mixed
static

Returns an array of dictionary contents, classified by object type: boolean, numeric, string, hex (string), indirect (object), name, array, dictionary, stream, and null. The array also has a '/length' element containing the number of bytes occupied by the dictionary in the source string, excluding the enclosing delimiters, if passed in.

since 2.10

Arguments

$source_string

$filesize

Response

mixed

array of dictionary objects on success, false on failure

Parse a PDF string object

_parse_pdf_string( &$source_string,  $offset) : array
static

Returns an array with one dictionary entry. The array also has a '/length' element containing the number of bytes occupied by the string in the source string, including the enclosing parentheses.

since 2.10

Arguments

$source_string

$offset

Response

array

( key => array( 'type' => type, 'value' => value, '/length' => length ) ) for the string

Parse a PDF Unicode (16-bit Big Endian) object

_parse_pdf_UTF16BE( &$source_string) : string
static
since 2.10

Arguments

$source_string

Response

string

UTF-8 encoded string
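
The conversion from 16-bit Big Endian text to UTF-8 can be sketched without any extensions, as below. The function name utf16be_to_utf8 is hypothetical and this is not necessarily how _parse_pdf_UTF16BE works internally. The sketch skips a leading byte order mark, combines surrogate pairs, ignores a trailing odd byte, and makes no attempt to repair unpaired surrogates.

```php
<?php
// Hypothetical helper (not part of MLAPDF): convert UTF-16BE bytes to UTF-8.
function utf16be_to_utf8( $source ) {
	// Drop the byte order mark (FE FF) if present.
	if ( "\xFE\xFF" === substr( $source, 0, 2 ) ) {
		$source = substr( $source, 2 );
	}

	$output = '';
	$length = strlen( $source );
	for ( $i = 0; $i + 1 < $length; $i += 2 ) {
		$code = ( ord( $source[ $i ] ) << 8 ) | ord( $source[ $i + 1 ] );

		// A high surrogate followed by a low surrogate encodes one code point.
		if ( $code >= 0xD800 && $code <= 0xDBFF && $i + 3 < $length ) {
			$low = ( ord( $source[ $i + 2 ] ) << 8 ) | ord( $source[ $i + 3 ] );
			if ( $low >= 0xDC00 && $low <= 0xDFFF ) {
				$code = 0x10000 + ( ( $code - 0xD800 ) << 10 ) + ( $low - 0xDC00 );
				$i += 2;
			}
		}

		// Emit the code point as one to four UTF-8 bytes.
		if ( $code < 0x80 ) {
			$output .= chr( $code );
		} elseif ( $code < 0x800 ) {
			$output .= chr( 0xC0 | ( $code >> 6 ) ) . chr( 0x80 | ( $code & 0x3F ) );
		} elseif ( $code < 0x10000 ) {
			$output .= chr( 0xE0 | ( $code >> 12 ) )
				. chr( 0x80 | ( ( $code >> 6 ) & 0x3F ) )
				. chr( 0x80 | ( $code & 0x3F ) );
		} else {
			$output .= chr( 0xF0 | ( $code >> 18 ) )
				. chr( 0x80 | ( ( $code >> 12 ) & 0x3F ) )
				. chr( 0x80 | ( ( $code >> 6 ) & 0x3F ) )
				. chr( 0x80 | ( $code & 0x3F ) );
		}
	}

	return $output;
}

$ascii = utf16be_to_utf8( "\xFE\xFF\x00H\x00i" );   // BOM, then "Hi"
$accent = utf16be_to_utf8( "\x00\xE9" );            // U+00E9
$astral = utf16be_to_utf8( "\xD8\x3D\xDE\x00" );    // surrogate pair for U+1F600
```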

Parse a cross-reference table section into the array of indirect object definitions

_parse_pdf_xref_section( $file_name,  $file_offset) : integer
static

Creates the array of indirect object offsets and lengths

since 2.10

Arguments

$file_name

$file_offset

Response

integer

length of the section

Parse a cross-reference stream into the array of indirect object definitions

_parse_pdf_xref_stream( $file_name,  $file_offset,  $entry_parms_string) : integer
static

Creates the array of indirect object offsets and lengths

since 2.10

Arguments

$file_name

$file_offset

$entry_parms_string

Response

integer

length of the stream

Parse a cross-reference table subsection into the array of indirect object definitions

_parse_pdf_xref_subsection( &$xref_section,  $offset,  $object_id,  $count) : void
static

A cross-reference subsection is a sequence of 20-byte entries, each with offset and generation values.

since 2.10

Arguments

$xref_section

$offset

$object_id

$count
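
The 20-byte entry layout mentioned above follows the PDF cross-reference table format: a 10-digit byte offset, a space, a 5-digit generation number, a space, the letter "n" (in use) or "f" (free), and a 2-byte end-of-line. A minimal sketch of walking one subsection is shown below. The function name parse_xref_entries and its return value are hypothetical; the real method returns void and records its results in the class's static array instead.

```php
<?php
// Hypothetical helper (not part of MLAPDF): decode $count fixed-width
// cross-reference entries that begin at $offset within $xref_section.
function parse_xref_entries( $xref_section, $offset, $object_id, $count ) {
	$entries = array();
	for ( $i = 0; $i < $count; $i++ ) {
		$entry = substr( $xref_section, $offset + ( 20 * $i ), 20 );
		if ( 20 !== strlen( $entry ) ) {
			break; // truncated subsection
		}

		// Byte 17 is the entry type; free ("f") entries have no object to locate.
		if ( 'n' !== $entry[17] ) {
			continue;
		}

		$entries[] = array(
			'number' => $object_id + $i,
			'generation' => (int) substr( $entry, 11, 5 ),
			'start' => (int) substr( $entry, 0, 10 ),
		);
	}

	return $entries;
}

// Three entries for objects 0, 1 and 2; object 0 is the free-list head.
$section = "0000000000 65535 f \n"
	. "0000000017 00000 n \n"
	. "0000000081 00002 n \n";

$entries = parse_xref_entries( $section, 0, 0, 3 );
```

Because every entry has the same width, the position of entry i is simply the subsection offset plus 20 * i, so no delimiter scanning is needed.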

Properties

Array of PDF indirect objects

pdf_indirect_objects : array
static

This array contains all of the indirect object offsets and lengths. The array key is ( object ID * 1000 ) + object generation. The array value is array( number, generation, start, optional /length ).

since
var

Type(s)

array
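
The key scheme for pdf_indirect_objects can be illustrated with a short sketch. The helper indirect_object_key and the sample values are hypothetical, and the element names used in the sample ('number', 'generation', 'start') are assumed from the description above.

```php
<?php
// Hypothetical helper (not part of MLAPDF): compute the documented array key.
function indirect_object_key( $object, $generation ) {
	return ( $object * 1000 ) + $generation;
}

// Sample table in the documented shape; offsets are illustrative only.
$objects = array();
$objects[ indirect_object_key( 12, 0 ) ] = array( 'number' => 12, 'generation' => 0, 'start' => 1534 );
$objects[ indirect_object_key( 12, 3 ) ] = array( 'number' => 12, 'generation' => 3, 'start' => 20977 );

// Look up generation 3 of object 12.
$key = indirect_object_key( 12, 3 );
$entry = isset( $objects[ $key ] ) ? $objects[ $key ] : NULL;
```

By plain arithmetic, keys built this way are unique only while the generation number is below 1000; for example, object 1 generation 1000 and object 2 generation 0 would both map to 2000.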