Inclusion Engine


What

The Inclusion Engine is a system for publishing data-enriched documentation, normally in the form of websites.

Why

The Inclusion Engine offers the following features & benefits:

Uniformity

Easy roll-out of new template appearances across many webpages in multiple sites.

Data-Enrichment

When a page impression is built, data from various graphs in the quadstore is merged into that page.

Pluggability

Because the core is XSLT hosted in DAV, other applications can invoke the same appearance, e.g. from VSP scripts. Similarly, a well-designed Inclusion Engine skin can do double duty as an ODS-Wiki skin.

Philosophy

It's actually just Virtuoso being Virtuoso, with a dash of good organization. Virtuoso provides the web-server, the built-in HTML5-aware version of Tidy, the XSLT engine, the database, the site configuration, the tables for caching...

At all times there is a potential trade-off between what a given skin enforces, allows or leaves up to the user.

Typical Use Case

Your corporate website could consist of these locations in DAV:

  • /DAV/VAD/inclusion-engine/sites/www/content/index.html
  • /DAV/VAD/inclusion-engine/sites/www/content/about-us.html
  • /DAV/VAD/inclusion-engine/sites/www/images/splash.png

By installing Inclusion Engine, these might become the URLs

  • http://www.example.com/
  • http://www.example.com/about-us/
  • http://www.example.com/images/splash.png

respectively.

A virtual directory would be configured to route all requests through an index.vsp. This script maps the requested URL to a path in DAV for the content (note the translation of the trailing / into .html) and also to a subject URI, which it uses in a SPARQL query for data "about this page". That data is included as HTML5 microdata or embedded as Turtle or JSON-LD in script tags.
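
The mapping described above can be sketched in a few lines. This is only an illustration in Python, not the code inside index.vsp: the function names are hypothetical, the base collection is taken from the example, and the rule that the subject URI is simply the request URL without query or fragment is an assumption.

```python
from urllib.parse import urlsplit

# Hypothetical base collection for the "www" site, taken from the example above.
SITE_BASE = "/DAV/VAD/inclusion-engine/sites/www"


def url_to_dav_path(url: str, site_base: str = SITE_BASE) -> str:
    """Map a requested URL to the DAV path of its source HTML."""
    path = urlsplit(url).path or "/"
    if path == "/":
        # The site root is served from content/index.html.
        return f"{site_base}/content/index.html"
    if path.endswith("/"):
        # A trailing slash is translated into a .html suffix.
        return f"{site_base}/content{path.rstrip('/')}.html"
    # Assumption: any other path is looked up under content/ unchanged.
    return f"{site_base}/content{path}"


def url_to_subject_uri(url: str) -> str:
    """Assumption: the subject URI is the request URL minus query and fragment."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}{parts.path or '/'}"
```

With these definitions, http://www.example.com/about-us/ maps to .../sites/www/content/about-us.html, matching the example locations listed earlier.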

How

Getting Started

Either use the Conductor or run

vad_install('../inclusion-engine_dav.vad');

through an iSQL commandline.

WebDAV Structure

Designate a directory in DAV to use as a base for the site. Within that, create collections content/ and images/.

Create and upload content/index.html; this will become the default page for the site.

Within the Conductor, configure a new virtual host (if necessary) and set the relevant virtual directory (/ in the case of a new vhost); configure it to map all requests to index.vsp and to override execute permissions (typically run as dba).

On this virtual host, add virtual directories:

  • /images pointing to the images/ directory you created, and
  • /skin pointing to the inclusion-engine-global skin collection (/DAV/VAD/inclusion-engine/skin/openlink/)

Opt-Out Directories

The above /images and /skin virtual directories are examples of a generic principle referred to as opting out. Inclusion Engine and its index.vsp handle the top-level / directory on your site, so if you have images, whitepaper PDF documents, site-specific JavaScript, or executable VSP applications, you should put these in their own virtual directories, which do not automatically execute everything via index.vsp.
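
To make the opt-out idea concrete, here is a small Python model of the routing decision. It assumes the server picks the most specific (longest) virtual directory that contains the requested path; the table and the handler labels are hypothetical and only mirror the directories mentioned above.

```python
# Hypothetical virtual-directory table: logical path -> handler description.
VIRTUAL_DIRS = {
    "/": "index.vsp",
    "/images": "static: images/",
    "/skin": "static: /DAV/VAD/inclusion-engine/skin/openlink/",
}


def resolve(path: str, vdirs: dict = VIRTUAL_DIRS) -> str:
    """Return the handler of the longest virtual directory containing path."""
    best = "/"
    for prefix in vdirs:
        # Match whole path segments only, so /imagesque is not inside /images.
        inside = path == prefix or path.startswith(prefix.rstrip("/") + "/")
        if inside and len(prefix) > len(best):
            best = prefix
    return vdirs[best]
```

In this model a request for /images/splash.png is served statically, while /about-us/ still falls through to index.vsp.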

Webmaster Files

A particularly notable set of locations to opt out of Inclusion Engine includes the URLs:

  • /favicon.ico
  • /robots.txt
  • /sitemap.xml
  • various Google-validation HTML files

Typically we suggest creating a DAV collection called webmaster alongside the normal content and images collections, to house the content of these files; then, a silent internal rewrite rule can be employed to map the public URLs to these as appropriate.
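
The effect of such a rewrite rule can be modelled as below. This is a Python illustration only, not Virtuoso rewrite-rule syntax; the function name is hypothetical, and the pattern for Google verification files (google followed by letters and digits, ending in .html) is an assumption about how those files are usually named.

```python
import re
from typing import Optional

# Public URLs that should be answered from the webmaster collection.
WEBMASTER_FILES = {"/favicon.ico", "/robots.txt", "/sitemap.xml"}

# Assumption: verification files look like /google1234abcd.html.
GOOGLE_VERIFICATION = re.compile(r"^/google[0-9a-z]+\.html$")


def webmaster_target(path: str, site_base: str) -> Optional[str]:
    """Return the DAV path inside webmaster/ for a webmaster URL, else None."""
    if path in WEBMASTER_FILES or GOOGLE_VERIFICATION.match(path):
        return f"{site_base}/webmaster{path}"
    # Not a webmaster file: leave the request to the normal handling.
    return None
```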

Living with a Skin

The default skin is located under /DAV/VAD/inclusion-engine/skin/openlink/. Within that collection, the structure is:

  • skin/openlink
  • skin/openlink/xslt
  • skin/openlink/css
  • skin/openlink/images
  • skin/openlink/js

and any other relevant collections the skin requires. This is exposed as a virtual directory /skin over HTTP.

The guiding principle here is that an image belongs in $site/images/ if it appears in the content of one or two pages; if it appears on every page as a feature of the skin (e.g. a company logo in the masthead) then it belongs in the skin/images/ directory instead.

As a rule, URLs for links and images should be specified site-relative, e.g. /images/foo.png or /skin/style.css, so that pages at any level within the site have equal access to them.

XSLT

By default, the top of the XSLT stylesheet tree is /DAV/VAD/inclusion-engine/skin/openlink/xslt/PostProcess.xslt. This file:

  • includes several other stylesheets implementing smaller modules
  • contains the overall output format specifications
  • contains the HTML skeleton (html, head, body)
  • fleshes-out the skeleton with placeholder calls to titles, metadata, links, microdata, masthead, navigation menu, content block and footer
  • transforms the HTML content from DAV
    • it is best to think of each HTML source file as a set of assertions about a page; e.g. if it starts off with <body><h1>Hello.</h1><p>Hello world.</p></body> then the final page is to have an h1 element followed by a paragraph. If the skin puts a masthead and footer around it, so be it. If the source has a <style>..</style> block in it, that block is relocated to the <head> of the finished page impression, after any style blocks the skin itself imposes, so that the more "local" definitions win in case of conflict.
  • performs any other tidy-ups required
    • e.g. links to pages outside your current domain might need a particular CSS class added; that can be implemented by intercepting the a element during XSLT
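
The external-link tidy-up mentioned in the last bullet is implemented in XSLT in a real skin. Purely to show the rule itself, here is the same logic in Python using the standard library; the function name and the class name external are hypothetical, and the input is assumed to be well-formed XML (as it would be after Tidy).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit


def mark_external_links(body_xml: str, own_host: str, css_class: str = "external") -> str:
    """Add css_class to every <a> whose href points at another host."""
    root = ET.fromstring(body_xml)
    for a in root.iter("a"):
        host = urlsplit(a.get("href", "")).netloc
        # Site-relative links have no host and are left untouched.
        if host and host != own_host:
            classes = a.get("class", "").split()
            if css_class not in classes:
                classes.append(css_class)
            a.set("class", " ".join(classes))
    return ET.tostring(root, encoding="unicode")
```

An XSLT version would do the equivalent in a template matching the a element, copying its attributes and appending the class when the href leaves the current domain.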

Flow of Control

There are two logical considerations to be aware of.

  • By default, index.vsp will run source HTML documents through HTML Tidy before XSLT. However, if the source HTML contains a DOCTYPE identifying itself as RDFa, the Tidy step will be bypassed.

  • A skin may choose to force a particular feature on the final impression, for example a left-margin navbar specific to the current site; however, if the source HTML contains an incantation such as <section id="content"> of its own, then it is deemed to have taken control of the page appearance and is given free rein over the entire page width instead.
    • Within this scope, another incantation might also be intercepted, such as <nav id="leftnav"></nav>, and replaced with the current site's navbar, but at a different place in the HTML than would otherwise have been the case.
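
The two decisions above can be sketched as simple predicates. This is a Python illustration with hypothetical function names; how index.vsp and the skin actually detect these cases is not stated here, so the substring and regular-expression checks are assumptions.

```python
import re


def needs_tidy(source_html: str) -> bool:
    """Tidy runs unless the DOCTYPE identifies the document as RDFa."""
    doctype = re.search(r"<!DOCTYPE[^>]*>", source_html, flags=re.IGNORECASE)
    # Assumption: "identifies itself as RDFa" means the DOCTYPE mentions RDFa.
    return not (doctype and "rdfa" in doctype.group(0).lower())


def content_takes_control(source_html: str) -> bool:
    """True if the source supplies its own <section id="content">."""
    pattern = r'<section\b[^>]*\bid=["\']content["\']'
    return re.search(pattern, source_html, flags=re.IGNORECASE) is not None
```

For example, a source file that begins with an XHTML+RDFa DOCTYPE would skip Tidy, and one containing <section id="content"> would be laid out without the skin's default left-margin navbar.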

Living with Caching

When invoked via the usual index.vsp method, Inclusion Engine caches the results of its transformations.

If the underlying file in DAV is modified, the cache will be invalidated and the page regenerated automatically.

If the surrounding XSLT skin is modified, you need to run

select incleng..staleall(); select incleng..config_flush_cache();

to flush the cache. Alternatively,

delete from incleng.cache where site='www';

restricts the clean-up to just the URLs in the given site.

When the inclusion-engine_dav.vad is installed, it will automatically mark the XSLT as stale and empty the cache for you.
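
The caching behaviour described in this section can be pictured with a toy model. The Python class below is hypothetical and is not how the incleng cache table is implemented: it only shows why a change to a source file is picked up automatically (the stored timestamp no longer matches) while a change to the skin needs an explicit flush (nothing in the cache key reflects the skin).

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class PageCache:
    """Toy model of a transformation cache keyed by (site, url)."""
    render: Callable[[str, str], str]          # (site, url) -> finished HTML
    source_mtime: Callable[[str, str], float]  # (site, url) -> DAV modification time
    entries: dict = field(default_factory=dict)

    def get(self, site: str, url: str) -> str:
        mtime = self.source_mtime(site, url)
        hit = self.entries.get((site, url))
        if hit is None or hit[0] != mtime:
            # Missing or stale: regenerate and remember the source timestamp.
            hit = (mtime, self.render(site, url))
            self.entries[(site, url)] = hit
        return hit[1]

    def flush(self, site: Optional[str] = None) -> None:
        """Drop every entry, or only the entries of one site."""
        self.entries = {
            key: value for key, value in self.entries.items()
            if site is not None and key[0] != site
        }
```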

Living with Configuration Options

Inclusion Engine's configuration is stored in RDF in the quadstore, in the graph urn:com.openlinksw.virtuoso.incleng.

For administration purposes, it also provides the SQL procedures:

  • incleng..config_set()
  • incleng..config_get()
  • incleng..config_add_site()
  • incleng..config_remove_site()
  • incleng..config_flush_cache()
  • incleng..staleall()

The incleng..config_set() function takes constraints on URL and site as its first parameters. When requesting a configuration parameter, incleng..config_get() first looks for statements about the currently requested URL; if none is found, it looks for statements about the current site; if none is found, it falls back on the configuration parameter's value in a global scope.

This facilitates interesting situations such as:

  • the global skin is set to a URL url1, its location in DAV
  • for the www site, we use url2, pointing to another skin
  • the global default is to include data as Turtle-in-script tags
  • but not for the www site
    • but in the case of the /about-us/ page, we do want it again
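
The URL, then site, then global fallback can be illustrated with the situation listed above. This Python sketch uses a hypothetical in-memory table in place of the RDF graph, and the function lookup is not the real signature of incleng..config_get(); the parameter names skin and turtle are likewise made up for the example.

```python
# Hypothetical stand-in for the configuration graph:
# (site, url, parameter) -> value, where None means "any".
CONFIG = {
    (None, None, "skin"): "url1",
    ("www", None, "skin"): "url2",
    (None, None, "turtle"): True,
    ("www", None, "turtle"): False,
    ("www", "/about-us/", "turtle"): True,
}


def lookup(param: str, site: str, url: str, config: dict = CONFIG):
    """Look for a setting on the URL, then on the site, then globally."""
    for key in ((site, url, param), (site, None, param), (None, None, param)):
        if key in config:
            return config[key]
    # No statement at any scope.
    return None
```

With this table, the www site uses url2 for its skin, Turtle embedding is off for www pages in general, and it is back on for /about-us/ only.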