Internal API¶
The internals of Imprint are implemented in the imprint.core
package.
Some of the internals are exposed to the user through the XML Tag API in
imprint.core.tags
and imprint.core.state
. The remainder is not
normally of interest to the user. However, it may be useful for developers and
authors of more complex plugins to have access to the internals of the engine.
Parsers¶
imprint.core.parsers
implements the parsers used to process the
XML Template. These parsers make up the heart of the
Engine Layer.
There are currently two parsers: ReferenceProcessor
and
TemplateProcessor
. Both are instances of
haggis.files.xml.SAXLoggable
. The former creates a table of
reference names/titles/locations/numbers that are used by the latter.
-
class
imprint.core.parsers.
DocxParserBase
¶ Base class that contains common functionality of the XML parsers that make up the Imprint Engine Layer.
This class is only intended to avoid code duplication. It serves no standalone purpose whatsoever.
The XML structure is encoded in the following attributes:
-
tag_stack
¶ A stack with special methods for entering a tag, exiting a tag, etc, with some structural validation. The current tag is always available via the
current
property. Each tag is pushed as an object containing the tag name, its (edited) attributes, whether or not it expects content and nested tags, and a flag indicating whether a warning has already been raised for unexpected text when the tag does not expect content. If the tag gets a data configuration, that is referenced as well.
-
-
class
imprint.core.parsers.
ReferenceProcessor
(heading_depth)¶ The SAX parser that is responsible for pre-computing all the relevant references found within the XML template.
Relevant references are any referenceable tags. This processor maintains its own reference counter based on the occurrence of <figure>, <table> and other tags within <par> tags with Heading styles.
-
class
imprint.core.parsers.
TemplateProcessor
(keywords, doc, references)¶ A parser to handle the entire document structure with the assumption that a reference mapping has already been made.
It processes all registered tags, generates all the content, and replaces all necessary components such as keywords, strings and references.
Much of the processing is handled by the built-in
TagDescriptor
s and the EngineState
. The parser itself performs sanity checking of the XML structure based on the requirements specified in the descriptors. In addition to checking attributes, content and nested tags, it performs a simplistic form of XML validation.
The engine state does not get direct access to the data configuration like it does to the keywords. The data configuration is maintained directly by this class:
-
data_config
¶ A
dict
containing all of the data configuration objects (dictionaries) loaded from the appropriate module if keywords contains a 'data_config'
key providing the module file name, and None
otherwise. Only document setups that actually use data configuration need to provide a configuration module.
-
Tag Handling¶
-
class
imprint.core.parsers.
RootTag
¶ Implement the Root tag, regardless of its name.
The root tag is special because any spurious text found within it gets stashed in a special paragraph.
-
class
imprint.core.parsers.
TagStack
¶ A
deque
-based stack that does some basic structural checking of the XML.
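To illustrate what a deque-based stack with basic structural checking might look like, here is a minimal sketch. It is not the actual TagStack implementation: the class name TagStackSketch and the open and close method names are hypothetical, and only the current property is taken from the description of tag_stack above.

```python
from collections import deque


class TagStackSketch(deque):
    """Hypothetical stand-in for a deque-based stack of open tag names."""

    @property
    def current(self):
        # The innermost open tag, or None when no tag is open.
        return self[-1] if self else None

    def open(self, name):
        # Entering a tag pushes it onto the right end of the deque.
        self.append(name)

    def close(self, name):
        # Structural check: a closing tag must match the innermost open tag.
        if not self:
            raise ValueError(f'closing </{name}> with no open tags')
        if self[-1] != name:
            raise ValueError(f'closing </{name}> while <{self[-1]}> is open')
        return self.pop()


stack = TagStackSketch()
stack.open('par')
stack.open('run')
stack.close('run')
print(stack.current)
```

The real class stores richer node objects rather than bare names; plain strings are used here only to keep the example short.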
-
class
imprint.core.parsers.
TagStackNode
(name, attr, descriptor=None, config=None, open_error=False)¶ A structure for maintaining information about open tags for
TemplateProcessor
.
All of the attributes except
warned
are immutable, so while tempting, a namedtuple
cannot be used.
All attributes are passed to the constructor in the same order that they are listed here. Only the first two are required.
-
name
¶ The name of the tag, not normalized in any way.
-
attr
¶ A plain
dict
containing the required
and optional
attributes of the tag. This attribute is mutable and gets passed to both the start
and end
methods of the tag descriptor. It is not one of the XML library immutable mappings.
-
descriptor
¶ The
TagDescriptor
object for this tag. This must always be an actual instance of the class, not a delegate object to be wrapped. Defaults to None
.
-
config
¶ The Data Configuration dictionary, if the
descriptor
calls for one, None
otherwise (the default). If the descriptor has a data_config
attribute set but this attribute is None
, then open_error
must be set to True
.
-
-
exception
imprint.core.parsers.
OpenTagError
¶ Used as a goto+label marker when processing opening tags.
As per https://stackoverflow.com/a/41768438/2988730 and https://docs.python.org/3/faq/design.html#why-is-there-no-goto.
This error is raised to indicate a non-fatal error that prevents the closing tag from being processed.
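The goto-style use of an exception described above can be sketched as follows. This is an illustration of the general pattern only, not the parser's code: OpenTagErrorSketch, open_tag_sketch and the required parameter are hypothetical names introduced for the example.

```python
class OpenTagErrorSketch(Exception):
    """Hypothetical stand-in for a non-fatal error while opening a tag."""


def open_tag_sketch(name, attr, required=()):
    """Return (name, open_error) for a tag that is being opened."""
    open_error = False
    try:
        for key in required:
            if key not in attr:
                # Acts like a goto: skip every remaining opening step and
                # jump straight to the handler below.
                raise OpenTagErrorSketch(f'<{name}> is missing {key!r}')
        # Any further opening steps would run here when all checks pass.
    except OpenTagErrorSketch:
        # The "label": record that the closing tag should not be processed.
        open_error = True
    return name, open_error


print(open_tag_sketch('figure', {'id': 'fig1'}, required=('id', 'handler')))
```

Raising a dedicated exception class keeps the early exit distinct from genuine errors, which continue to propagate normally.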
Utilities¶
imprint.core.utilities
contains general utilities to help
the engine create and process docx files.
The configuration loaders in this module are potentially suitable for inclusion in the haggis library.
-
imprint.core.utilities.
aggressive_strip
(string)¶ Split a string along newlines, strip surrounding whitespace on each line, and recombine with a single space in place of the newlines.
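The behavior described above can be approximated in a few lines. This is a sketch written from the description, not the library's source; the name aggressive_strip_sketch is hypothetical, and the use of str.splitlines (rather than splitting on '\n' only) is an assumption.

```python
def aggressive_strip_sketch(string):
    """Collapse a multi-line string into a single line."""
    # Split on newlines, trim each line, then rejoin with single spaces.
    return ' '.join(line.strip() for line in string.splitlines())


print(aggressive_strip_sketch("  first line  \n\tsecond line\n   third"))
# → first line second line third
```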
-
imprint.core.utilities.
check_fail_state
(fail)¶ Verify that fail is one of the valid options
{'raise', 'warn', 'ignore'}
. Raise a
ValueError
if it is not.
-
imprint.core.utilities.
trigger_fail_state
(fail, msg, error_class=<class 'ValueError'>, warn_class=<class 'UserWarning'>)¶ React to a failure according to the value of
fail
:
- 'ignore': Do nothing.
- 'warn': Raise a warning with message msg and class warn_class (UserWarning by default).
- 'raise': Raise an error with message msg and class error_class (ValueError by default).
Any other value of fail triggers a
ValueError
.
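The three fail states can be sketched with the standard warnings module. The functions below are hypothetical re-implementations written from the descriptions of check_fail_state and trigger_fail_state, not the library's code; in particular, validating fail before acting on it and the stacklevel value are assumptions.

```python
import warnings

VALID_FAIL_STATES = {'raise', 'warn', 'ignore'}


def check_fail_state_sketch(fail):
    # Reject anything that is not one of the three documented options.
    if fail not in VALID_FAIL_STATES:
        raise ValueError(f'fail must be one of {sorted(VALID_FAIL_STATES)}, '
                         f'not {fail!r}')


def trigger_fail_state_sketch(fail, msg, error_class=ValueError,
                              warn_class=UserWarning):
    check_fail_state_sketch(fail)
    if fail == 'warn':
        # stacklevel=2 attributes the warning to the caller (an assumption).
        warnings.warn(msg, warn_class, stacklevel=2)
    elif fail == 'raise':
        raise error_class(msg)
    # 'ignore' falls through and does nothing.


trigger_fail_state_sketch('ignore', 'this message is never reported')
```

A caller can then expose a single fail argument and let the user choose how strictly problems are reported.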
-
imprint.core.utilities.
get_handler
(handler_name)¶ Load the named plugin handler.
Handlers are callables that take an object ID and configuration dictionary and generate content for a specific tag like <figure>, <table> or <string>.
If the handler is not found as-is, the
imprint.handlers
package is prefixed to handler_name since that is where all built-in handlers live.
-
imprint.core.utilities.
load_callable
(name, package_prefix=None, magic_module_attribute=<haggis.SentinelType object>, instantiate_class=False)¶ Retrieve an arbitrary callable from a module.
The input may be one of six things:
1. A module with a magic_module_attribute that contains the callable.
2. A callable that implements the correct interface.
3. The name of a module containing the magic_module_attribute.
4. The name of a callable.
5. The name of a module in the package_prefix package.
6. The name of a callable in the package_prefix package.
The correct thing is identified as leniently as possible and returned. The returned object is not guaranteed to be the correct thing, just to pass very cursory inspection (e.g., modules must have the magic attribute and any other objects must be callable).
Items 1, 3 and 5 are not possible if magic_module_attribute is not specified. Items 5 and 6 are not possible if package_prefix is not specified.
This method has one special case. If the object found is a class with a no-arg __init__ method and a __call__ method, an instance rather than the class object is returned. Note that class objects themselves are callable, so if you specify a class without a no-arg __init__ method or without a __call__ method, make sure that __init__ has the signature you require and returns the object that you expect.
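As a rough illustration of the name-based lookup with a package prefix fallback, the sketch below resolves a dotted name to a callable using importlib. It covers only the two "name of a callable" cases (with and without package_prefix) and ignores modules, the magic attribute and class instantiation. The function name resolve_callable_sketch is hypothetical, and this is not how load_callable is actually implemented.

```python
import importlib


def resolve_callable_sketch(name, package_prefix=None):
    """Resolve a dotted name to a callable, retrying under a prefix."""
    candidates = [name]
    if package_prefix is not None:
        # Fallback: look for the same name inside the prefix package.
        candidates.append(f'{package_prefix}.{name}')
    for candidate in candidates:
        module_name, _, attr = candidate.rpartition('.')
        if not module_name:
            continue  # A bare name has no module to import from.
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        obj = getattr(module, attr, None)
        # Cursory inspection only: anything callable is accepted.
        if callable(obj):
            return obj
    raise ValueError(f'could not resolve {name!r} to a callable')


# The standard json package stands in for a handler package here.
print(resolve_callable_sketch('dumps', package_prefix='json')([1, 2]))
```

Trying the name as-is before applying the prefix mirrors the order described for get_handler, where the prefix is used only if the name is not found directly.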
Perform a keyword replacement on all valid new-style format strings in the header and footer XML of a Word document.
This operation is currently done by treating the XML as if it were one giant string. The approach is hacky, but the assumption holds in practice, since format-like strings delimited by ‘{}’ are unlikely to appear anywhere outside
<w:t>
tags.
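The string-based approach can be sketched as a regular-expression substitution over the raw XML text. This is an illustration under simplifying assumptions, not the engine's code: the name replace_keywords_sketch is hypothetical, only plain {name} fields are handled (no format specifications or conversions), unknown names are left untouched, and replacement values are not XML-escaped.

```python
import re

# Matches simple fields such as {title}; format specs are not supported.
_FIELD = re.compile(r'\{([A-Za-z_][A-Za-z0-9_]*)\}')


def replace_keywords_sketch(xml_text, keywords):
    """Substitute {name} fields in xml_text using the keywords mapping."""
    def substitute(match):
        key = match.group(1)
        # Leave fields without a matching keyword exactly as they were.
        return str(keywords[key]) if key in keywords else match.group(0)
    return _FIELD.sub(substitute, xml_text)


header = '<w:p><w:r><w:t>Report {title}, rev {rev}</w:t></w:r></w:p>'
print(replace_keywords_sketch(header, {'title': 'Alpha', 'rev': 3}))
# → <w:p><w:r><w:t>Report Alpha, rev 3</w:t></w:r></w:p>
```

One known limitation of treating the XML as a string is that Word may split a single field across several runs, in which case no pattern of this kind will match it.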