Veracity Query Language (VQL)

Veracity Query Language (VQL)

VQL is a language for queries over xAPI statements and how the LRS should process and render them. It underlies the Veracity Learning LRS in several places, including the xAPI and dashboard components. There are also a few features that allow you to supply your own VQL to the system, so learning VQL can help you get the most out of the LRS.

Overview

VQL helps with xAPI statement search, processing, and visualization. It's a JSON-serialized format, meaning that you'll interact with the server by sending JSON-encoded JavaScript objects that adhere to the VQL schema.

The most basic component of VQL is the query. A query has two parts: the filter and the process.
{   "filter": {<a filter object, or null>}, 
"process": [{<an array of processing stages>}]}
A query with no process with return raw xAPI statements. A query with no filter will process all data through the requested process. If you define both a filter and a process, then the filter will filter the data before the process is applied.

A process is a series of steps that transform the data from the previous step (or all the data that passes the filter). Each step in a process is an object with a single property. The name of the property corresponds to the action to take, and the value is an object with the needed settings for the action.
{   "filter": {},
"process": [{
"$frequentValues": {
"path": "actor.id"}}]}
The above example runs the $frequentValues command over all data in the LRS. Data flows from one command in the process to the next, until reaching the end of the process. The result of the query is the data at this point.
Notes
We've collapsed some of the whitespace in the JSON in this article, to make it easier to read.

Where to Use VQL

There are several places you can use VQL in the LRS:
  1. The Chart Builder — in the Analytics section, in any custom dashboard — uses VQL under-the-hood to show you the compiled query.
  2. The Advanced Chart Builder — also available in any custom dashboard.
  3. The Analytics Workshop — in the left menu, under Management > All Management Tools.
  4. Via Saved Scripts — you can use Saved Scripts on access keys or on forwarding rules to filter or transform the data.
  5. Via LTI — you can send a query to the server via LTI. The server will return an HTML page that graphs the response.
  6. Via HTTP — you can POST a VQL query to {your xAPI endpoint}/analyze. Your xAPI access key must have the AdvancedQuery permission.

Filter Commands

A filter is a way to either search for xAPI statements, or to narrow the scope of a process. A basic filter sets conditions for fields on a statement.
{   "filter": {
"actor.id": "mailto:rob@veracity.it"},
"process": [{...}]}
If a filter contains a property name that matches a xAPI statement path and the value of that property is basic data type like a string or number, then the filter selects statements where the path equals the value. The filter above selects statements where the actor.id is mailto:rob@veracity.it. The Veracity Learning LRS computes an actor.id field to be the IRI of the agent. See the xAPI specification for the Actor.

Basic Matching

{   "filter": {
"actor.id": {
"$in": {[
"mailto:one@example.com",
"mailto:two@example.com"]}}},
"process": [{...}]}
If the value of a path is a complex object, then the object must contain a single property from the set of allowed query comparison operators. The value of the property is the parameter to the operator. Valid operators are:
  1. $ne — Not equal.
  2. $gt — Greater than.
  3. $gte — Greater than or equal.
  4. $lt — Less than.
  5. $lte — Less than or equal.
  6. $in — The value in the statement at the path is IN the provided list. The value must be an array.
  7. $nin — Not in some lists. The value must be an array.

Logical Operators

You can use special objects to denote the logical AND and OR concepts. You activate the objects with special properties at the top level of the filter object. If you define multiple properties outside of a logical operator, then the filter ANDs them together.
{   "filter": {
"object.id": "https://moodle.com/example",
"result.completion": true},
"process": [{...}]}
The above filter selects statements where the object.id is https://moodle.com/example AND result.completion is true.
{   "filter": {
"$or": [{
"object.id": "https://moodle.com/example"}, {
"result.completion": true}]},
"process": [{...}]}
The above filter selects statements where the object.id is https://moodle.com/example OR result.completion is true.
{   "filter": {
"$and": [{
"timestamp": {"$lt": 1581907900000}}, {
"timestamp": {"$gt": 1581906900000}}]},
"process": [{...}]}
The above filter selects statement with timestamps between two values. You must use a top-level $and in this case, because the JSON object cannot have the same property name twice.

Pagination

A VQL query can “paginate” results so you can select a subset of the matched statements. This only applies when the process is empty.
  1. $skip — Skips over the specified number of statements and passes the remaining statements to the filter.
  2. $limit — Limits the number of statements passed to the filter.
To skip or limit, include top-level keys in the filter object like so:
{   "filter": {
"$skip": 100,
"$limit": 100},
"process": [{...}]}
The above example returns a second “page” of statements, when a page has 100 statements.
Notes
The $skip plus $limit must be less than the result query window, which is currently defined at 10,000. If your query returns more than 10,000 results, then choose a narrower filter by filtering on timestamp or stored.

Sorting

To sort a query, add a top-level $sort key, give it a property name (for the statement path), and give it a value of either asc (for ascending) or desc (for descending).
{   "filter": {
"$sort: {
"result.score.scaled": "asc"}},
"process": [{...}]}
The above filter returns statements in ascending order by the value of result.score.scaled. If several statements have equal values at the first key, then add a second (and third) key for sorting:
{   "filter": {
"$sort": {
"result.score.scaled": "asc",
"stored": "desc"}},
"process": [{...}]}
The above filter sorts first by result.score.scaled and then on the stored value.

Advanced Filter Modifiers

VQL has several special modifiers that you can add to a filter to change its behavior.
  1. $inherit — Controls whether the filter inherits higher level filters. Defaults to true.
  2. $source — What collection does the filter query? Defaults to statements. Used this to make the entire query operate over a different set of data, like the canonical tables or the xAPI document endpoints. Valid values are:
    1. statements
    2. canonicalAgents
    3. canonicalActivities
    4. canonicalVerbs
    5. state (for activity states)
    6. activityProfile
    7. agentProfile
    8. userLogs
    9. attachments

Processing Commands

VQL processes queries as a series of commands, each inputting a list of documents and outputting a list of documents. The VQL “Planner” software module selects the best database to fulfill the series of commands. Process commands can run in JavaScript, MongoDB, Elasticsearch, or all three in the same query. The VQL Planner arranges operations for the least switching between databases and JavaScript and for the fastest computing of the result. VQL can process some commands using a single technology, while it can run others in several. Regardless, the results are the same. The VQL Planner can freely choose one solution or another. If your Elasticsearch connection weakens, then the VQL Planner will use a “Mongo DB and JavaScript only” strategy. VQL will still compute the same results for the query, but this strategy can significantly affect performance. Because VQL also has a time limit of 10 seconds to complete a query, you need a strong Elasticsearch connection to get practical results from large datasets.

When there is no process, VQL imposes a “sanity” limit of 10,000 results. If VQL can fulfill the query by a strategy that does not pull statements out of a database, this it does not enforce this limit, and considers the entire dataset. The first command in the process takes as its input the list of statements that pass the filter. Usually, VQL can compute this first command in the database, and so it does not enforce the 10,000-statement limit.
Notes
Some versions of VQL must pull statements into JavaScript before processing. In this case, it enforces the 10,000-statement limit. If you think this is happening, then check the output of $explain.

Paths

In VQL commands, a parameter that expects a statement path always ends with the world “path”. It often is the word path. These paths should be the simple JSON notation for a path into a statement, like:
"path": "result.response"

"path": "object.definition.type"
You can also use array notation, such as:
"path": "context.contextActivities.parent[0].id"
For extensions, because the keys contain dots, use bracket notation, like:
"path": "context.extensions ['https://www.test.com/extension'].value"
If a process stage queries the statement database, then it expects a statement path, so do not add a leading $ to the value of the path command:
{   "filter": null,
"process": [{
"$max": {"path": "result.score.raw"}}]}
Since it is in the first stage in the process, the path above matches against all statements, and the $max stage filters on the whole database.
If a process stage has incoming documents, then add the leading $ to the value of the path command (and use dot notation). For instance:
{   "filter": null,
"process": [{
"$paths": {...}}, {
"$max": {"path": "$duration"}}]}
In this case, we use $duration to signal that we want to compute against the incoming document stream passed from the $paths command, rather than from all statements.

Metrics

A VQL process always computes the metric, count. However, many of the most useful commands in VQL can compute many other metrics. In these cases, the process computes the “metric” value from the operator and path that you give. All commands that support metrics support these basic operators:
  1. $min — Find the minimum value of a path.
  2. $max — Find the maximum value of a path.
  3. $sum — Find the sum of all values on a path.
  4. $average — Find the average of all values on a path.
  5. $stdDev — Finds the population standard deviation of all values on a path.
Metrics are named, and all output documents have the name and the corresponding value appended. For instance:
{  "$timeSeries": {
"path": "timestamp"}},
This query generally output a stream of documents in the form:

Output
[{  "_id": 1726099200000, "count": 5}, {
"_id": 1726444800000, "count": 8}, {
"_id": 1726531200000, "count": 11}, {
...}]
If you request an additional metric, you do like this:
{   "$timeSeries": {
"path": "timestamp",
"metrics": {"averageScore": {"$avg": "result.score.raw"}}}},
In the above example, the metrics name is averageScore, the operator is $avg, and the path is result.score.raw. The output document stream will now be:

Output
[{  "_id": 1726099200000, "count": 5, "averageScore": 55.800000000000004}, {
"_id": 1726444800000, "count": 8, "averageScore": 56.666666666666664}, {
"_id": 1726531200000, "count": 11, "averageScore": 57.5}, {
...}]
While you can have several named metrics, each metric can have only one operator. Every operator expects its value to be a path.

The word metrics above is special — every command that supports metrics expects them to be inside this key.

Some commands support additional “special metrics” beyond these. The pattern is the same, but special metrics expect complex objects with many settings, rather than just a path value.

Common Processes

VQL includes a set of commands to make common operations easy:
  1. $frequentValues
  2. $timeSeries
  3. $punchCard
  4. $spread
  5. $paths
  6. $pathsToNodes
  7. $ratio
  8. $count
  9. $expression
  10. $min, $max, $avg, $sum, and $stddev
  11. $keys
  12. $toCanonicalDisplay
  13. $toCanonicalResponse
These commands cover both visualization and data processing; and the default outputs of the common processing commands match seamlessly with certain rendering commands. See Rendering Commands below for more information on rendering visualizations.

Every process command should be a JSON object with one property. This property is the command name, which will always start with a $. The value of the property is an object with the needed command parameters.
{   "filter": null,
"process: [{
"$frequentValues": {
"path": "actor.id"}]}
The above query computes the ten most common actor IDs.

$frequentValues

The $frequentValues command finds the most common distinct values of some path. It supports metrics and sorting by those metrics.
{   "$frequentValues": {
"path": "actor.id",
"limit": 10,
"metrics": <see "metrics">,
"sort": <see "sorting">}},
The $frequentValues command always outputs the count of statements for each unique value, regardless of the metrics configuration.
  1. path — A JSON path into the statement. The unique values at this path counted, and the top ten (or top limit) results are output.
  2. limit — The number of values to select.
  3. metrics — See metrics. $frequentValues also has two Special Metrics options.
  4. sort — See sorting. Selects unique values by some criteria other than the count.
The resulting document set is an array of objects. The _id field of each object is the unique value of the path. The count is the number of times that value occurred.

Output
[{  "_id": <the unique value at path>,
"count": <the count of objects with path===_id>,
"metrics": <metrics output>}, {
...}]

$frequentValues — Special Metrics

In addition to the common metrics, $frequentValues supports two special metrics:
  1. $timeSeries
  2. $frequentValues
If you use either of these metrics, then the command generates a child dataset by splitting all the data that matched each unique path value and running a $timeSeries or $frequentValues query on just that child dataset.
{   "$frequentValues": {
"path": "actor.id",
"metrics": {
"topObjects": {
"$frequentValues": {
"path": "object.id"}}}}},
The above queries compute the most frequent objects for each of the most frequent actors. While this process can compute iteratively, this representation is easier to understand and works better with the Rendering Commands.

Output
[{  "_id": "mailto:musufdik@latiwri.sc",
"count": 264,
"topObjects": [{
"_id": "http://example.com/mini-game",
"count": 84},
...]}, {
...}]
The output is an array with structure like the outer $frequentValues.

If you use these commands normally, then sub- $frequentValues and $timeSeries support all the typical metrics. However, you cannot sort.

$frequentValues — Sorting

By default, the $frequentValues command arranges values in descending order of frequency (desc), but you can override it by adding sort to the metric with the asc criterion.
{   "$frequentValues": {
"path": "actor.id",
"metrics": {"score": {"$avg": "result.score.scaled"}},
"sort": {"score": "asc"}}},
We return the first ten (or limit) results, so sorting in ascending order yields the bottom ten scores.
Notes
The MongoDB limit is 65,535 “open file descriptiors”.
If you want to find the bottom results by count, then you don’t need a metrics command, because a process always computes a count metric:
{   "$frequentValues": {
"path": "actor.id",
"sort": {"count": "asc"}}},
The above process returns the ten least frequent actor IDs.

$timeSeries

The $timeSeries command gathers data into “bins” by day, month, or hour according to the value of the path. The datatype of the values at the path should be timestamp but could also be the value of an extension. The $timeSeries command supports the default count as well as basic metrics.
{   "$timeSeries": {
"path": <the path in the statement to a timestamp>,
"span": <either "daily", "hourly", or "monthly">,
"timezone": <a time zone descriptor for selecting month, and day boundaries. Defaults to "UTC">,
"metrics": <see "metrics">}},
The $timeSeries command has no special metrics. You cannot sort because the process already sorts the data by time.
{   "$timeSeries": {
"path": "timestamp",
"span": "daily",
"metrics": {"score": {"$avg": "result.score.scaled"}}}}},
The command above builds a timeseries for each day, computing the count of statements and average of result.score.scaled on each day.

Output
[{  "_id": 1726099200000, "count": 5, "score": 0.558000000000004}, {
"_id": 1726444800000, "count": 8, "score": 0.566666666666664}, {
"_id": 1726531200000, "count": 11, "score": 0.575}, {
...}]
The $timeSeries command will output a sequence of objects where the _id property is a time in milliseconds, count is the count of statements with a path in that interval, and the metrics specified.

$punchCard

The $punchCard command computes a two-dimensional array over a period: the first dimension is days (of the week or month), and the second is hours (per day). In addition to the default count metric, the command can compute metrics for each point. You'll usually use a $punchCard with a $heatChart, but you can also use more logic to select certain units, for instance, find the busiest day of each month.
{   "$punchCard": {
"path": <the path in the statement to a timestamp>,
"span": <either "weekly", or "monthly">,
"timezone": <a time zone descriptor for selecting month, and day boundaries. Defaults to "UTC">,
"metrics": <see metrics>}},
As a convenience, the command returns a dayOfWeek value as well. The command returns only count for the first outer document, but the command can compute metrics for the second inner one.

Output
[{  "count": 2199,
"day": 1,
"dayOfWeek": "Monday",
"hours": [{
"hour": 0, "count": 48, "score": 0.4707142857142857},
...]}, {
...}]

$spread

Use the $spread command to organize values into a defined number of “bins”. These bins can have metrics. You can use this for generating histograms from a smoothly varying input field.
{   "$spread": {
"path": <the statement path to the input values>,
"min": <the minimum possible value>,
"max": <the maximum possible value>,
"count": <the count of buckets to generate>}},
For instance, a bar chart showing the count of all the scaled scores (which normally run from 0.0 to 1.0) would produce a graph with many bars, each with a single value. Something like this:
    "process": [{
"$frequentValues": {
"path": "result.score.scaled"}}, {
"$barChart": {}}]


This is not a histogram! Instead, this chart shows the relative count of values, not the distribution of values. Sorting by the X-axis label would improve things, but it would still select the top unique scores and plot their frequency. To get a correct view of the data, you must make sure you asked for enough bars so that there was a bar for every unique value in the dataset. On a smooth scale, this could be unlimited! Likewise, the process would count the scores 0.8000001 and 0.8 separately, with a bar for each. A $frequentValues command is not the best way visualize the distribution over a smooth input field.

This query will render a true ten-bar histogram, over the range of scaled scores from 0.0 to 1.0.
{   "process": [{
"$spread": {
"path": "result.score.scaled",
"max": 1.00,
"min": 0,
"count" :30}}, {
"$barChart": {}}]}


Notes
If you get an error mentioning $switch, then that means a value returned by the query fell out of range. If you don't know the range, then use an $inlineSubQuery with a $min or $max command to fill it in.

$paths

The $paths command gathers statements into lists by a unique value and computes metrics over the list. You can consider its raw output as “sessions” and render them in a $ganttChart, or you can examine the path using more logic. The $paths command supports the normal metrics as well as two special metrics: $first and $last. For convenience, it also computes the duration and first and last step in the path. It sorts statements by timestamp, to keep the path in order.
{   "$paths": {
"stepValuePath": <What value is each step? Default: "object.id">,
"sessionPath": <What value is in common? Default: "context.registration">}},
Because xAPI has a rather flexible representation of “session”, you must tell the $paths command what path in a statement has the unique value for each session. While it's possible for an actor to vary across a session, we expect that each statement grouped by the unique values of sessionPath has the same actor. Also, since it's possible for a session to span multiple object.id values, the process doesn't default to this. Consider your data representation before using this feature.

By default, the process assumes that the object.id is the “value of each step in a path”. However, you can override this to be any path in the statement.
{   "$paths": {
"stepValuePath": "result.score.scaled",
"sessionPath": "actor.id"}},
The above configuration of $paths will build an ordered list of each score on any activity, by actor.
{   "$paths": {
"stepValuePath": "context.extensions['http://example.com/module'].id",
"sessionPath": "context.contextActivities.grouping.id"}},
The above configuration of $paths will build an ordered list of each module.id from the context extensions based on a session ID stored in the grouping context, which is typical of xAPI statements using the SCORM Profile.

$paths – Special Metrics

The $paths command supports two special metrics:
  1. $first
  2. $last
If you push statements into a path list, then these metrics pull the first or last value seen for the path.
{   "$paths": {
"stepValuePath": "object.id",
"sessionPath": "context.registration",
"metrics": {
"success": {"$last": "result.complete"},
"actor": {"$last": "actor.id"}}}},
The above configuration will pull paths across objects by registration and find the completion status of the final item. It also pulls the actor's name. Since you can assume that a registration is unique to an actor, either the first or the last actor.id will tell you the actor of the whole session.

Output
[{  "_id": "3a336b33-d520-4602-a674-767225a7025f",
"path": [
"http://example.com/mini-game",
...],
"duration": 1038555,
"first": "http://example.com/mini-game",
"last": "http://example.com/mini-game",
"length": 8,
"start": "2019-06-14T08:51:13.018Z",
"end": "2019-06-14T09:08:31.573Z"}, {
...}]
The above document shows the output from a $path command with no metrics. The fields are:
  1. _id — The unique value of the sessionPath.
  2. path — A list of the values at stepValuePath.
  3. duration — The difference in timestamp from the last to the first event in the path, in milliseconds.
  4. first — The first value in the path list.
  5. last — The last value in the path list.
  6. length — The number of entries in the path list.
  7. start — The timestamp of the first event in the path.
  8. end — The timestamp of the last event in the path.

$pathsToNodes

Notes
About Paths and Nodes
VQL includes some special commands for working with paths through content. The most basic is the $paths command, which can generate ordered lists of steps and then render them with the $ganttChart. However, if you want to visualize the intersections with a $chordChart or the branches with a $sankeyChart, then you need to process the data with $pathToNodes.
The $pathsToNodes command takes the sequence of steps from $paths and builds a network of nodes with connections. Nodes are connected when a path includes them in sequence. The node weight increases each time the pair is seen. Consider these given paths:

Input
[   [A, B, C]
[B, B, C]
[B, C, B, C]]
The network of links will be:
  1. A to B — seen 1 time.
  2. B to C — seen 4 times.
  3. B to B — seen 1 time.
  4. C to B — seen 1 time.
Thus, node A will have one outgoing link (to B), node B will have five (four to C and one to B) and C will have one (to B). In a real-world context, these values A, B, and C might be lesson identifiers. Such a graph of nodes lets you ask questions like “What lesson do students most frequently take after the Earth Science lesson?”. If you combine it with subqueries and filters, then you can ask, “What lesson do high achieving students most frequently take after the Earth Science lesson, versus their low achieving peers?”
{   "$pathsToNodes": {
"order": <Boolean>,
"nodeIdPath": <Path that uniquely identifies the node>,
"nodeTitlePath": <Path that uniquely identifies the node>}},
The defaults match the output of $paths. The nodeIdPath and nodeTitlePath both work over the values of the $.path array in each incoming document. By default, these documents are simple strings, so you can ignore nodeIdPath and nodeTitlePath.

The order key tells the $pathsToNodes command whether to consider nodes at distinct positions in the path as unique. The above example (with A, B, and C) treats order as false. If order were true, then there would be one A node, three B nodes and three C nodes. Because A is always the first, there is only one. B can be the first, second or third, so there are three. Likewise, C can be the second, third or fourth, so there are three.

Imagine a $paths command that outputs the following:

Output
[{  "_id": "46713ed1-60f3-45a2-a45c-eaefbc3ecd5f",
"path": [
"http://example.com/simple-example",
"http://example.com/mini-game"],
"duration": 891565,
"first": "http://example.com/simple-example",
"last": "http://example.com/mini-game",
"length": 2,
"start": "2019-06-18T11:58:12.610Z",
"end": "2019-06-18T12:13:04.175Z",
"success": null}, {
"_id": "e58d3c54-74ca-405e-8fae-845f19479c64",
"path": [
"http://example.com/simple-example",
"http://example.com/mini-game",
"http://example.com/mini-game",
"http://example.com/simple-example"],
"duration": 2747275,
"first": "http://example.com/simple-example",
"last": "http://example.com/simple-example",
"length": 4,
"start": "2019-06-18T11:48:56.360Z",
"end": "2019-06-18T12:34:43.635Z",
"success": null}, {
...}]
From the same statements, the $pathsToNodes command would output a series of nodes for each object.id in the $.path:

Output
[{  "in": 5,
"id": "http://example.com/simple-example",
"count": 9,
"objectId": "http://example.com/simple-example",
"title": "http://example.com/simple-example",
"out": {
"http://example.com/mini-game": 2,
"http://example.com/simple-example": 2,
"http://example.com/standard-scenario": 1},
"step": "0",
"totalOut": 5}, {
...}]
Above, is a single node from the document results stream named http://example.com/simple-example, with three outputs, each to one of the other nodes. The objectId names the original value of the object on that path that created the node. The outgoing link describes the other node objectId, not the id. This is because the nodeIdPath can overwrite the value of id. This object also has total in and total out numbers. You can subtract these to find a falloff rate.

You can render such a network of nodes directly with a $chordChart command.

$ratio

The $ratio command is shorthand for computing relative number of statements matching two queries.
{   "$ratio": {
"of": <filter>,
"over": <filter>}},
This command runs two queries in parallel and compute their relative counts. The command expands these filters into subqueries, which follow the filter inheritance rules. Make sure you use the of and over properties exactly as described in the Filter section.

The $ratio command does not support metrics.
{   "$ratio": {
"of": {
"verb.id": "http://adlnet.gov/expapi/verbs/scored",
"result.score.scaled": {"$gt": 0.7}},
"over": {
"verb.id": "http://adlnet.gov/expapi/verbs/launched"}}},
The above example computes the ratio of statements where a verb matches scored and the score is greater than 0.7 over the number of total launches.

Output
[{  "ratio": 0.2727272727272727}]
The $ratio command will output a single document with one key. This key is always ratio and is the result of the count of statements matching the first filter (of) to those matching the second (over).

$count

The $count command counts the number of documents in the stream, or the number of statements that match the filter. The only configuration is the name of the output value.
{   "$count": "outputKey"},
Output
[{  "outputKey": 264}]
If this command is the first command in the processing pipeline, then it matches against statements themselves: the command does this very efficiently. If it's not the first, then the command counts the documents in the stream.

$expression

Use the $expression command to compute an expression. The $expression command destroys the input stream, which means can only use it if you know the expression beforehand. You can use an $inlineSubQuery to form a complex expression. The $expression command always outputs an array with a single document.
{{ "$expression": {
"$subtract": [1, 3]}}},
The above process results in the following output document array:

Output
[{  "value": -2}]
Of course, this is not especially useful. Use this stage with $inlineSubQuery instead. Here's a better example:
{{ "$expression": {
"$subtract": [{
"$inlineSubQuery": {
"filter": {"verb.id": "http://adlnet.gov/expapi/verbs/passed"},
"process": [{"$count": "count"}],
"select": {"mode": "one", "path": "$[0].count"}}}, {
"$inlineSubQuery": {
"filter": {"verb.id": "http://adlnet.gov/expapi/verbs/failed"},
"process": [{"$count": "count"}],
"select": {"mode": "one", "path": "$[0].count"}}}]}}},
This finds two counts of different verbs and subtracts one from the other. If you could peek inside the VQL Planner, then you’d see:
{   "type": "ExpressionStage",
"args": {
"expr": {"$subtract": [1514, 364]},
"lrs": "102d1251-0d5c-491e-b87b-250061a9d7f8"}}
The $inlineSubQuery turns this into a static computation.

$min, $max, $avg, $sum, and $stdDev

Just like you can use them as metric operators for other commands, you can also use these as standalone commands in a process to find the minimum, maximum, average, sum, or standard deviation of values. Each takes only a path setting, which identifies the location in the statement or document to process.
{   "$min": {"path": <a statement path>}},
Output
[{  "value": 264}]
If these commands are the first command in the processing pipeline, then they process the statements themselves. The command can do this very efficiently. If not the first, then they process the documents in the stream.

While not obviously powerful on their own, these commands can be used in combination with a filter to ask, “What is the average score on this exam?”, or “Of all their attempts, what is the longest a student ever spent in the module?”

$keys

The $keys command returns information about the structure of data. It takes a single parameter (path) and returns an array of the possible child document keys and their data types.
{   "$keys": {"path": "result"}},
Output
[{  "_id": "duration", "type": "string"}, {
"_id": "extensions", "type": "object"}, {
"_id": "score", "type": "object"}, {
"_id": "success", "type": "boolean"}]
Use this command to inspect the structure of unknown data, especially in extensions, where the xAPI schema gives no information.

$toCanonicalDisplay

Use the $toCanonicalDisplay command to map some agent, activity, or verb to its “canonical” representation. Use this command if your dataset does not include the pretty name or description in every verb. You can join to that data with the command and only send a simple verb.id with most statements.
Notes
Use this command as late in your process as you can, because it requires exiting the database and continuing processing in JavaScript. In some cases, this can have a massive performance penalty, so use this only right before a rendering command.
{   "$toCanonicalDisplay": {
"path": <JSON path to an identifier. Default: "_id">,
"outkey": <Property name to append. Default: "canonical">,
"mode": <Either "any", "activity", "agent", or "verb">}},
In many cases, using only the defaults will accomplish the goal. You can increase efficiently by letting the system know if you expect the value to represent an agent, activity, or verb. By default, it will query all three tables and return the first match.

For example, given an input stream of documents like this:

Input
[{  "_id": "mailto:musufdik@latiwri.sc", "count": 264}, {
...}]

This command…
{   "$toCanonicalDisplay": {}},
… results in a stream of documents like this:

Output
[{  "_id: "mailto:musufdik@latiwri.sc",
"count": 264,
"canonical": {
"_id": "5d08fc24aa70bd4ba10d2970",
"id": "mailto:musufdik@latiwri.sc",
"account_homepage": null,
"account_name": null,
"count": 264,
"created": "2019-06-18T14:58:44.448Z",
"display": "Amelia Duval",
"isActor": true,
"mbox": "mailto:musufdik@latiwri.sc",
"mbox_sha1sum": null,
"name": "Amelia Duval",
"updated": "2019-06-18T14:59:17.104Z"}}, {
...}]
You can also use $toCanonicalDisplay to map sub-arrays. If the path has a *, the command will compute the value for each object in the embedded array. For example, when given the document stream:

Output
[{  "_id": "Amelia Duval",
"objects": [{
"_id": "http://example.com/mini-game", "count": 84}, {
"_id": "http://example.com/simple-example", "count": 80}, {
...]},
...}]
The following command…
{   "$toCanonicalDisplay": {
"path": "objects[*]._id"}},
… produces the following output:

Output
[{  "_id": "Amelia Duval",
"objects": [{
"_id": "http://example.com/mini-game", "count": 84, "canonical": {...}}, {
"_id": "http://example.com/simple-example", "count": 80, "canonical": {...}}, {
"_id": "http://example.com/standard-scenario", "count": 52, "canonical": {...}}, {
"_id": "http://example.com/cdiff-scenario", "count": 48, "canonical": {...}}]}, {
...}]
Each subdocument in the embedded objects array has had its _id mapped to it canonical representation. This is particularly useful when rendering a chart that can process embedded arrays.

$toCanonicalResponse

The $toCanonicalResponse command works very much like the $toCanonicalDisplay command, but you use it to map a response ID to the response display text from table of canonical values.
{   "$toCanonicalResponse": {
"path": <Path to an identifier. Default: "_id">,
"outkey": <Property to append the canonical data. Default: "canonical">,
"objectId": <Hard-coded single activity ID from the canonical tables>,
"objectIdPath": <Path of the proper activity, per input document>}}
If the incoming document stream has responses from multiple different activities, then you would use objectIdPath.

MongoDB Commands

Many VQL commands operate as a special “stage” in a MongoDB aggregation pipeline of data from your xAPI statement dataset. As such, VQL also supports some other MongoDB stages.

$addFields

Use the $addFields command to add a field to a document. The process computes this field using the specified expressions.

$project

The $project command lets you restructure a document by computing new fields from old ones.

$group

The $group command separates documents according to a key: usually, _id. You set any other fields in the output using expressions.
Notes
A $group stage does not order its output documents.

$unwind

The $unwind command deconstructs an array to a document for each element in the array. Each document replaces the value of the array field with the element.

$match

Use the $match command as an ad hoc filter to select only those documents that match a specified condition.

$switch

The $switch command checks a series of conditional expressions. If it finds a true expression, then it executes another expression and exits.

$sort

Use the $sort command to ad hoc sort documents in a particular order.


Flow Control Commands

VQL includes a few basic commands for iteration, parallel processing, and subqueries. Additionally, it has some special symbols that dictate how it passes data around within complex queries.

$query

Think of $query as a “subquery” instead, which you can use to “go back to the database” to compute more values. A $query command follows a set of rules for deciding how it interacts with its parent query.
  1. A $query inherits filters from the parent (by default, unless $inherit is false).
  2. A $query is not affected by the filters of its siblings.
  3. Never use a $query to query the incoming document stream. Use $match instead.
  4. The output of a $query is an array of documents (like the parent query), which replaces the parent result set completely, avoiding nested arrays.
  5. The result of a $query overwrites the previous data stream (as do all commands).
  6. VQL evaluates any $input, $context, and $each commands beforehand, when a $query command executes.
{   "filter": null,
"process": [{
"$query": {
"filter": {
"result.score.scaled": 1}}}]}
The above $query will simply return a list of statements where result.score.scaled is 1.
{   "filter": null,
"process": [{
"$query": {
"filter": {
"result.score.scaled": 1}}}, {
"$query": {
"filter": {
"result.score.scaled": 0}}}]}
In the example above, the second $query destroys the results of the first. This is a significant waste of computational resources, and you should avoid it.
{   "filter": {
"actor.id": "mailto:rob@veracity.it"},
"process": [{
"$query": {
"filter": {
"result.score.scaled": 1}}}]}
In the above example, the $query finds statements where the actor.id is mailto:rob@veracity.it AND the scaled score is 1. This is because the $query inherits the parent filters
{   "filter": {
"actor.id": "mailto:rob@veracity.it"},
"process": [{
"$query": {
"filter": {
"$inherit": false,
"result.score.scaled": 1}}}]}
In the above example, the $query finds statements where the scaled score is 1, regardless of the parent query actor.id filter. This is because $inherit is false.
{   "filter": {
"actor.id": "mailto:rob@veracity.it"},
"process": [{
"$count": "count"}, {
"$query": {
"filter": {
"result.score.scaled": 1}}}]}
Similarly, the value of $count above is lost. What then is the purpose of a $query at all? Read on…

$parallel

The $parallel command lets you run multiple $query at the same time. The results of each $query is an element in the output array, which means you'll end up with nested arrays.
{   "filter": {
"actor.id": "mailto:musufdik@latiwri.sc"},
"process": [{
"$parallel": {
"stages": [{
"$query": {
"filter": {},
"process": [{
"$min": {"path": "result.score.scaled"}}]}}, {
"$query": {
"filter": {},
"process": [{
"$max": {"path": "result.score.scaled"}}]}}]}}]}
The above query results in an array of two datasets. Since each stage in the $parallel is a query, it outputs an array. Because the $min and $max stages always output one document, the results will be:

Output
[[{ "value": 0.009999999776482582}], [{
"value": 1}]]
The $query stages inherit the parent filters. In all circumstances, the child stages should be $query.
{   "filter": {
"actor.id": "mailto:musufdik@latiwri.sc"},
"process": [{
"$parallel": {
"stages": [{
"$min": {"path": "result.score.scaled"}}, {
"$max": {"path": "result.score.scaled"}}]}}]}
The above structure is not currently supported. The $min and $max stages would compute over the whole dataset because they are not “query consuming” commands and are nested in such a way that the VQL Planner cannot plan their filters. The VQL Planner will instead select 1,000 results that match the filter, move them into JavaScript, and compute the minimum and maximum there. This could be right, but you'll never know if there were more data to consider or not. VQL may support this construction in the future.
Notes
Several patterns cause the VQL Planner to retrieve data from the database and continue processing the query in JavaScript. If this happens, then it confines the first dataset to some maximum number of results, 1,000 currently. In general, query patterns that compute in this mode are inefficient and can be inaccurate. You should strive to form queries that VQL can compute mostly in the database. Check the output of $explain for query execution data.

$mergeArrays

In the $parallel command example, the result of a query was a set of nested arrays. The $mergeArrays command concatenates these arrays and outputs a single array. The query…
{   "filter": {
"process": [{
"$parallel": {
"stages": [...]}}, {
"$mergeArrays": {}}]}
… will produce…

Output
[{  "value": 0}, {
"value": 1}]
… rather than…
[[{ "value": 0}], [{
"value": 1}]]
The $mergeArrays command can make continued processing much simpler. For instance, you might find the highest score for several students and then average those scores.
{   "filter": {},
"process": [{
"$parallel": {
"stages": [...]}}, {
"$mergeArrays": {}}, {
"$avg": {"path": "$value"}}]}

$explain

The $explain command replaces the result set with a lengthy report on what the query engine did. Because you can use different ways to describe the same logic, you may wish to compare speeds or debug your logic. This command lets you see stage timings and the results at each step in the computation.

The $explain command currently accepts no settings. Any value will work, although in the future, we may allow select certain types of data in the settings.
{   "$explain": true},

Language Features

VQL offers some extra JSON objects for various special uses.

$inlineSubQuery

The $inlineSubQuery command is a preliminary subquery. You can use it to make a query and feed the results into the formation of a higher-level subquery. This lets you use more expressive syntax than using a combination of a $query stage and the $input keyword, although the results are similar.
{   "filter": {
"actor.id": {
"$in": {
"$inlineSubQuery": {
"select": {"path": "$[*]._id", "mode": "all"},
"filter": {},
"process": [{
"$frequentValues": {
"path": "actor.id"}}]}}}},
"process": [{...}]}
The above example finds the ten most frequent actors and gets statements with those actors, without processing them. The process shows the computed result of the subquery. If you could peek inside the VQL Planner, then the filter for the query would look like this:
    "filter": {
"actor.id": {
"$in": [
"6e4c2cd96ae7366434fcde2990b9c4aecd194235",
"237a069230bd63a33a199308c7051ea8b3782fb0",
"aec226e629317cd1dff60fda5913450ffba304ee",
"8811f8c8fde39e96233499dd99a2dffa42bf9106",
"a986c5de80ec830429649599d5547fc574c6e39e",
"e979eabe1cee325ec09e0cc98b2c93c18045726a",
"41595cc08504104e89e35c61ba58ed00b3a038c8",
"80cd17a320ecb1df3fc7f475ee83e2f88dc76e5d",
"fedf814142928dbf6047c4ece8058fa9990b17bc",
"e36cfe9d632c8fa2cc499445d1d2420f0f674269"]}},
Where the $in query operator has an array, which is the output from the $inlineSubQuery. As in the select field in the example above, you may need to extract some value from the subquery result:
  1. select.path — A JSON path expression. Can use array notation like $[0].something.
  2. select.mode — Either one or all. Default: one. If set to one, then the process returns the first value matching the path. If set to all, then the process places all matching values into an array.
So, to find statements by the single most frequent agent, you would form the $inlineSubQuery as:
{   "filter": {
"actor.id": {
"$inlineSubQuery": {
"select": {"path": "$[0]._id", "mode": "one"},
...}}},
"process": [{...}]}
The computed filter in the VQL Planner for the query then becomes:
   "filter": {
"actor.id": "6e4c2cd96ae7366434fcde2990b9c4aecd194235"},

$input

The $input command lets you reference the values that flow into a query, if that query is not the first in a process. The $input is evaluated at beforehand, which for a $query command comes when that $query is executed. This means that you can make a query that depends on the values of an earlier query!
{   "filter": {},
"process": [{
"$query": {
"filter": {},
"process": [{
"$avg": {"path": "result.score.scaled"}}]}}, {
"$query": {
"filter": {
"result.score.scaled": {"$gt": {"$input": {"path": "$[0].value"}}}},
"process": [{
"$frequentValues": {
"path": "actor.id"}}]}}]}
Consider the above example. It uses two $query in sequence, feeding the results of one process into the query of another. The first $query, run alone, returns a dataset like this:

Output
[{  "value": 0.5000182093987425}]
The value is the average score of all statements because the filter is blank. Because $input is resolved at execution time:
"$input": {"path": "$[0].value"}
… the output becomes…
0.5000182093987425
Therefore, the filter for the second $query becomes…
"result.score.scaled": {"$gt": 0.5000182093987425}
Since there is no query at the top level, the second $query will filter the full dataset. This can be summed up as “Find the average, and then find the people who most frequently score higher.”
{   "$input": {
"path": <a JSON path>,
"mode": <"all" or "one">}},
  1. path — A JSON path expression. Can use array notation, like $[0].something.
  2. mode — Either one or all. Default: one. If set to one, then the process returns the first value matching the path. If set to all, then the process places all matching values into an array.

$each

The $each command makes it simpler to execute multiple filters in parallel. If filter has $each in it, then the process automatically “unwinds” the filter to a set of parallel $query for each different value. For instance:
{   "filter": {
"$each": [{
"actor.id": "rob@veracity.it", "activity.id": "https://www.example.com/module"}, {
"actor.id": "tom@veracity.it", "activity.id": "https://www.example.com/module"}, {
"actor.id": "jason@veracity.it", "activity.id": "https://www.example.com/module"}]},
"process": [{
"$frequentValues": {
"path": "result.response"}}]}
… will internally expand into…
{   "filter": {},
"process": [{
"$parallel": {
"stages": [{
"$query": {
"filter": {
"actor.id": "rob@veracity.it",
"activity.id": "https://www.example.com/module"},
"process": [{
"$frequentValues": {
"path": "result.response"}}]}}, {
"$query": {
"filter": {
"actor.id": "tom@veracity.it",
"activity.id": "https://www.example.com/module"},
"process": [{
"$frequentValues": {
"path": "result.response"}}]}}, {
...}]}}]}
You can express the same logic even more concisely by inverting the location of the $each:
{   "filter": {
"actor.id": {
"$each": ["rob@veracity.it", "tom@veracity.it", "jason@veracity.it"]},
"activity.id": "https://www.example.com/module"},
"process": [{
"$frequentValues": {
"path": "result.response"}}]}
The above example shows how you can use $each to process a list of IDs.

You can merge $each with $input and $query to find some list, and then repeat a process for each value in that list. For instance:
{   "filter": null,
"process": [{
"$query": {
"filter": {
"activity.id": "https://www.example.com/module"},
"process": [{
"$frequentValues": {
"path": "actor.id"}}]}}, {
"$query": {
"filter": {
"actor.id": {
"$each": {
"$input": {"path": "$[*]._id", "mode": "all"}}},
"activity.id": "https://www.example.com/module"},
"process": [{
"$frequentValues": {
"path": "result.response"}}]}}]}
The above example finds the most frequent actors on a module, and then, for each of them, finds their most frequent responses. You could also compute this example with a $frequentValues and a sub- $frequentValues metric.

$context

Sometimes, you need to know some other value to build the query but can only know that value after you send the query. Use a $context stage to replace part of the query with such a value before the query executes. The parameters are the same as $input, and the logic works the same way — you select one or all results of a JSON path and replace the $context call with those values. The path is evaluated against the “context object”, which varies with different applications of VQL.
{   "$context": {
"path": <a JSON path>,
"mode": <"all" or "one">}},
A $context stage always includes the current filter and process of the current query. If you execute a $query, then the process generates a new “context object” and makes the context of the parent query available as $parent.xxx. So, in a $query, to access the top-level filter, you can write:
{   "$context": {
"path": "$parent.filter"['actor.id']"}}
If you use the automatic parallel queries generated by $each, then you can get the value of the currently-executing branch to use for a title in a rendering command.
{   "filter": null,
"process": [{
"$query": {
"filter": {
"$each': [{
"actor.id": "mailto:musufdik@latiwri.sc"}, {
"actor.id": "mailto:ece@pi.si"}, {
"actor.id": "mailto:letir@wimameda.tn"}]},
"process": [{
"$frequentValues": {
"path": "object.id"}}, {
"$barChart": {
"categoryPath": "canonical.display",
"seriesTitle": {"$context": {"path": "$.filter['actor.id']"}}}}]}}]}
The above example uses $context to give a series title to each bar chart generated by $each.

$def

The $def command lets you extract a bit of VQL and reuse it in many places in a query. VQL stores “defined” values computed in a defs stage in the top-level query object or passed them down into the $def command wherever it's used.
{   "filter": null,
"defs": {
"min": 0,
"max": 100},
"process": [{
"$expression": {
"min": {"$def": "min"},
"max": {"$def": "max"}}}]}
Results in:

Output
[{  "value": {"min": 0, "max": 100}}]
Of course, this could be much more interesting!
{   "filter": null,
"defs": {
"min": {
"$inlineSubQuery": {
"filter": null,
"process": [{"$max": {"path": "result.score.scaled"}}],
"select": {"mode: "one", "path": "$[0].value"}}}, {
"max": {
"$inlineSubQuery: {
"filter": null,
"process": [{"$min: {"path": "result.score.scaled"}}],
"select": {"mode": "one", "path": "$[0].value"}}}},
"process": [{
"$expression": {
"min": {"$def": "min"},
"max": {"$def": "max"}}}]}
The defs stage in the above query has two $inlineSubQuery commands. However, we've not really saved much buy using $def, since we use each definition only once. This doesn't decrease the amount of code.

We can simplify further by parameterizing the defs like so:
{   "filter": null,
"defs": {
"score": {"path": "result.score.scaled"},
"minOrMax": {
"$inlineSubQuery": {
"filter": null,
"process": [{"$def": "param"}],
"select": {"mode": "one", "path": "$[0].value"}}}},
"process": [{
"$expression": {
"min": {"$def": {"path": "minOrMax", "defs": {"param": {"$min": {"$def": "score"}}}}},
"max": {"$def": {"path": "minOrMax", "defs": {"param": {"$max": {"$def": "score"}}}}}}}]}
Here, the minOrMax definition has another $def inside it. From where are these filled? From the values in the defs field of the $def command. This is a nested function call. We call the $def that defines a subquery and pass new defs to resolve the full inlineSubQuery. Just to make things more amazing, the value you pass into minOrMax refers to the score definition. You can imagine <score like a simple function call that takes no parameters and returns a literal string.

Rendering Commands

One major goal of VQL is to harmonize visualization and data processing. To achieve that goal, VQL includes commands that translate data into a chart configuration object. This object is then fed to the process to produce the chart. VQL, builds charts using amCharts, so the rendering commands output an amCharts-compatible chart description.

Rendering commands are simply another command in a process. They process the incoming document stream according to their own settings. Some rendering commands change the incoming stream to make it compatible with VQL.

Several rendering commands also support multiple series (grouped and merged charts) or split series (stacked charts), described later in this section

$barChart

The $barChart command generates one bar for each incoming document in the stream. The height of the bar — the “value” — is one property in the document. The $barChart defaults align with the $frequentValues stage, so you can process a basic $frequentValues with no configuration:
{   "$barChart": {
"categoryPath": "_id",
"valuePath": "count",
"stackCategory": "_id",
"stackTitle": "_id",
"titlePath": "_id",
"seriesTitle": <a string>}}
  1. categoryPath — This is the field to use to identify each bar. Default: _id.
  2. valuePath — This is the field to use as the height of the bar. Default: count.
  3. titlePath — Use this path to select the title of each split in a split series. Default: _id.
  4. stackCategory — If the categoryPath selects a child array in a split series, then use this to select the splits. Default: _id.
  5. stackTitle — If the categoryPath selects a child array in a split series, then use this to select the title of each split. Default: _id.
  6. seriesTitle — A title for this series, which only appears in charts with multiple series.
{   "filter": {},
"process": [{
"$frequentValues": {
"path": "actor.id"}}, {
"$barChart": {}}]}
Because the defaults are aligned, you can render a $frequentValues without any configuration. The above query generates the following render:

Here, you can see that the _id field of the path became the bar title, and the count field became the bar height.
{   "filter": {},
"process": [{
"$frequentValues": {
"path": "actor.id"}}, {
"$toCanonicalDisplay": {}}, {
"$barChart": {
"categoryPath": "canonical.display"}}]}
Above, the $toCanonicalDisplay command adds a property to each document — canonical.display. The $barChart uses this, rather than the _id field, to label the bars.

{   "filter": {},
"process": [{
"$frequentValues": {
"path": "actor.id",
"metrics": {"avgScore": {"$avg": "result.score.scaled"}}}}, {
"$barChart": {
"valuePath": "avgScore"}}]}
Above, the query computes a metric, which adds a property called avgScore to each document in the stream. Then, the $barchart stage sets the height of the bar to the value of the avgScore field.

The above query computes the average score of the most frequent ten actors, not the ten top average scores. See $frequentValues.

$pieChart

The $pieChart command is almost identical logically to the $barChart. It generates one slice for each document in the stream.

$serialChart

Use the $serialChart command to render a $timeSeries stage. It works in much the same way, with slightly different parameters. The defaults are aligned with the $timeSeries command, so you can produce an “Activity over Time” chart very easily.
{   "filter": {},
"process": [{
"$timeSeries": {}}, {
"$serialChart": {}}]}
The above query creates a simple line chart of all traffic over time. This is because the default valuePath is count, which is always included in a $timeSeries.


  1. categoryPath — Use this field to select timestamps. Default: _id.
  2. valuePath — Use this for the height of the line at each timestamp. Default: count.
  3. lineCategory — If the cateogryPath is a child array in a split series, then use this to select the lines. Default: _id.
  4. lineTitle — If the cateogryPath is a child array in a split series, then use this to name the line. Default: _id.
  5. seriesTitle — A title for this series. This stage only shows this when merging charts in a multiple series.

$notice

A $notice widget is a large, centered line of text with an icon, and a subtitle section below. A $notice widget only shows data from the 0th document in the incoming stream.
  1. title — The text to display.
  2. color — The color of the icon. Default: #407188.
  3. subtitle — The subtitle text.
  4. icon — The icon class. Veracity uses Font Awesome icons. You do not need to include the prefix, fa-. Default: check.
  5. valuePath — A JSON path for selecting what object to feed into the string interpolation algorithm. Defaults to the 0th document in the stream.
  6. precision — Selects how many decimal places to use for formatting numbers. Default: 2.
{   "$notice": {
"title": "Demo Title",
"subtitle": "The subtext goes here",
"icon": "check"}}
The command above produces the image below.

You may notice that the incoming data doesn't change this render at all!

String Interpolation
The $notice and $list widgets accept template literal strings, into which you can embed expressions that select and change the values to display.
{   "$notice": {
"title": "${value.canonical.display}",
"subtitle": "${value._id} was launched ${value.count} times",
"icon": "check",
"valuePath": "$"}}
The example above gets values from the 0th document in the stream. You can add logic upstream from the rendering command to customize the icons and colors as well.
{   "title": "Top Agents",
"filter": {},
"process": [{
"$frequentValues": {
"path": "actor.id",
"metrics": {"avgScore": {"$avg": "result.score.scaled"}},
"limit": 7}}, {
"$skip": 1}, {
"$toCanonicalDisplay": {}}, {
"$project": {
"avgScore": "$avgScore",
"count": "$count",
"canonical": "$canonical",
"icon": {
"$switch": {
"branches": [{
"case": {"$gt": ["$avgScore", 0.49]}, "then": "check"}],
"default": "close"}},
"color": {
"$switch": {
"branches": [{
"case": {"$gt": ["$avgScore", 0.49]}, "then": "green"}],
"default": "red"}}}}, {
"$notice": {
"title": "${value.canonical.display}",
"subtitle": "${value._id} scored ${value.avgScore}%",
"icon": "${value.icon}",
"valuePath": "$",
"color": "${value.color}",
"precision": 2}}]}
The above example uses two $switch commands in a $project stage to compute values for the icon and color. Then, it uses string interpolation to compute the values in the widget.

These values are dynamic: depending on the data in the statement, you may see either a red “X” or a green check.

$list

The $list command works identically to the $notice command, except it lists each document in the incoming stream.


The same query in the earlier notice example produced the widget above, with the command $notice changed to $list.

$heatChart

A heat chart builds a two-dimensional grid of boxes, colored according to a metric. Although you can use it with the $punchCard command, the default document formats don't match. Therefore, you'll have to provide a bit of logic before the rendering command. You can also use it to visualize any list of documents with two category axes and one value axis.
{   "$heatChart": {
"categoryPathY": <Must be a simple property name. Not a full JSON Path>,
"categoryPathX": <Must be a simple property name. Not a full JSON Path>,
"valuePath": <Must be a simple property name. Not a full JSON Path>}}
  1. categoryPathX — The property name on the incoming documents that identifies the position on the X-axis.
  2. categoryPathY — The property name on the incoming documents that identifies the position on the Y-axis
  3. valuePath — The property name on the incoming documents that identifies the value at the given X and Y.
{   "filter": {},
"process": [{
"$punchCard": {
"path": "timestamp",
"metrics": {"score": {"$avg": "result.score.scaled"}}}}, {
"$unwind": "$hours"}, {
"$project": {
"day": "$day",
"hour": "$hours.hour",
"count": "$hours.count",
"score": "$hours.score"}}, {
"$heatChart": {
"categoryPathX": "day",
"categoryPathY": "hour",
"valuePath": "score"}}]}
Above, you can see the projection needed to prepare the data.



You can see that the command uses the average score to plot the color. Light gray is 0.38 and dark gray is 0.64. The process computes the min and max values automatically.

$ganttChart

The $ganttChart widget displays sessions as bars on a date axis. Its defaults align with the $paths command, so you can show $paths as sessions on a Gantt chart without any configuration.
{   "$ganttChart": {
"categoryPath": <the “row” identifier for sessions on the same row. Default: "_id">,
"startPath": <the path to a timestamp that opens the session>,
"endPath": <the path to a timestamp that closes the session>}},
The $ganttChart command does not support any additional configuration.
{   "filter": {},
"process": [{
"$paths": {
"sessionPath": "context.registration"}}, {
"$ganttChart": {}}]}
The above process creates a row for every different session, because the _id of the each $path is a context.registration, which is unique per session.



This query creates a very large chart! You can zoom and scroll with the bars on the top and right.
{   "filter": {},
"process": [{
"$paths": {
"sessionPath": "context.registration",
"metrics": {"actor": {"$last": "actor.id"}}}}, {
"$limit": 50}, {
"$ganttChart": {
"categoryPath": "actor"}}]}
In the above query, the $last special metric grabs the name of the actor in the session. Also, it has a limit on the amount of data to show. With the categoryPath set to actor, rows have the actor's name, and sessions for the actor appear on the same row.



$chordChart

{   "$chordChart": {
"titlePath": <a path to a title for display>}}
The $chordChart command takes only one parameter, the titlePath. It uses it to title the nodes in the legend. You can use projection logic or $toCanonicalDisplay to generate titles other than the IDs.


Chord charts only make sense when the order value of the $pathsToNodes is false.

You can use the titlePath parameter along with $toCanonicalDisplay to map the IDs to names.
{ "filter": {},
"process": [{
"$toCanonicalDisplay": {
"path": "id"}}, { "$chordChart": { "titlePath": "canonical.display"}}]}

The chart selects colors based on the node title. In the above example changing the node titles changed the colors.

$sankeyChart

You can use a Sankey chart to visualize the flow through a series of steps. The order parameter of the preceding $pathsToNodes must be true, or else the graph will cycle, and VQL cannot process charts with cycles.
{   "$sankeyChart": {
"titlePath": <a path to a title for display>}}
The Sankey chart shows falloff as a fading line.


{   "filter": {},
"process": [{
"$toCanonicalDisplay": {
"path": "objectId"}}, {
"$sankeyChart": {
"titlePath": "canonical.display"}}]}
If you map a node to some value for display, then use the objectId. This is because the id field uniquely identifies the node. If order is false, then several nodes match each objectId, depending on the location in the path. Therefore, those node IDs have an additional integer appended.


The chart selects colors based on the node title. In the above example changing the node titles changed the colors.

Using the above tools, you can visualize paths and flows. However, that's not all! You can do logic over the nodes before the rendering command or even use them as inputs to further logic. As an exercise for the reader — How would you find the average score of students that took the Standard Precaution Scenario first?

$mergeCharts

There are several scenarios where you might generate multiple different charts in a single command. Consider…
{   "filter": {
"actor.id": {
"$each": ["rob@veracity.it", "tom@veracity.it", "jason@veracity.it"]},
"activity.id": "https://www.example.com/module"},
"process": [{
"$frequentValues": {
"path": "result.response"}}, {
"$barChart": {}}]}
The above query will generate three bar charts in parallel! See $each for an explanation. VQL does not support a query result that is an array of charts — you must combine them or call the rendering command once for each result. The $mergeCharts command can merge multiple bar charts, multiple pie charts, or multiple line charts into a “multi-series” bar, pie, or serial chart. A multi-series bar or pie differs in that bars and slices are not split but grouped. The process adds a legend with the series titles.



The above was generated by:
{   "filter": null,
"process": [{
"$query": {
"filter": {
"$each": [{
"actor.id": "mailto:musufdik@latiwri.sc"}, {
"actor.id": "mailto:ece@pi.si"}, {
"actor.id": "mailto:letir@wimameda.tn"}]},
"process": [{
"$frequentValues": {
"path": "object.id"}}, {
"$toCanonicalDisplay": {"path": "_id"}}, {
"$barChart": {
"categoryPath": "canonical.display",
"seriesTitle": {"$context": "$.filter['actor.id']"}}}]}}, {
"$mergeCharts": {}}]}
The $mergeCharts must be the last in the outer process. This example also has a nice example of using $context in $each. The $mergeCharts command takes no parameters and will error if the charts are not all of the same type.

Split Series Rendering

Several rendering commands support “split series”, where each document in the stream has an array field. If you correctly configure it, then the process “splits” the widget — so that a bar chart has stacked bars, a pie chart becomes a sunburst, or a serial chart has multiple lines.

The $barChart (and $pieChart and $serialChart) commands can detect a split series. To enable this, you must first have a document stream that has a property that is itself a list of objects. You would usually do this with a $frequentValues that has a special metric of $frequentValues or $timeSeries. If the data has a subseries, then use a JSON path to select two subseries: the valuePath and categoryPath. The defaults align to in most cases.
{   "filter": {},
"process": [{
"$frequentValues": {
"path": "actor.id",
"metrics": {
"objects": {
"$frequentValues": {
"path": "object.id"}}}}}, {
"$barChart": {
"valuePath": "objects[*].count",
"categoryPath: "objects[*]._id"}}]}
In the example above, the categoryPath and valuePath use array notation to select the values in subseries, objects. The * lets the widget know that you wish to split the series. The categoryPath and valuePath both select the same subseries: this is mandatory.


By default, the title of each split is the _id of the parent object. You can override this by using the stackCategory property. Consider:
{   "filter": {},
"process": [{
"$frequentValues": {
"path": "actor.id",
"metrics": {
"objects": {
"$frequentValues": {
"path": "object.id"}}}}}, {
"$toCanonicalDisplay": {}}, {
"$barChart": {
"valuePath": "objects[*].count",
"categoryPath": "objects[*]._id",
"stackCategory": "canonical.display"}}]}
Because the split series uses the categoryPath, use stackCategory to tell the widget how to title the splits.



Recall that $toCanonicalDisplay can also map subseries. We'll leave as an exercise to the reader to figure how to change the legend verb IDs into verb display titles.

The $pieChart command also supports a split series. When rendering a split series, the $pieChart generates a visualization called a “sunburst” diagram.


Because a $frequentValues command can have a $timeSeries special metric, you can configure the $serialChart stage to produce a split series chart.
{   "filter": {},
"process": [{
"$frequentValues": {
"path": "actor.id",
"metrics": {
"objects": {
"$timeSeries": {
"path": "timestamp"}}}}}, {
"$serialChart": {
"categoryPath": "objects[*]._id",
"valuePath": "objects[*].count"}}]}


Like stackCategory and stackTitle, use lineCategory and lineTitle to name subseries.
{   "filter": {},
"process": [{
"$frequentValues": {
"path": "actor.id",
"metrics": {
"objects": {
"$timeSeries": {
"path": "timestamp"}}}}}, {
"$toCanonicalDisplay": {}}, {
"$serialChart": {
"categoryPath": "objects[*]._id",
"valuePath": "objects[*].count",
"lineCategory": "canonical.display"}}]}
The lineCategory field above selects the value from the root object. This is because you don't name each point on a line, only the whole line itself. This differs somewhat from a split bar or pie, where you can name each split.


The Difference between titlePath, categoryPath, stackCategory, and stackTitle
In general, use the categoryPath or stackCategory values to organize the splits. It is possible that two bars might have different identities, but you wish to display the same title. Imagine two actors with the same name. Because they are distinct identities, they should have two bars, but both bars would have the same title. In this case, use categoryPath to tell the renderer what makes a document unique, and titlePath to select the title to display.

Customizing the Renderer

Because the rendering expressions return JSON objects, you can continue to process the document array after the rendering command! You can use this little trick to customize the colors, axis, labels, or any other components of a chart. Veracity uses amCharts for rendering, so refer to their JSON specification to learn about all the things you can change. Below are a few examples:


Starting with this command,
{   "$addFields": {
"chart.series.0.tensionX": 0.78,
"chart.series.0.fill": "#333",
"chart.series.0.stroke": "#fff",
"chart.series.0.strokeWidth": 2,
"chart.series.0.strokeDasharray": "3,3",
"chart.series.0.strokeOpacity": 1}}


To color bars in a $barChart, you'll must attach a color value to each entry in the array, then ask the chart to use it. This process colors the bars randomly:
{   "$addFields": {
"color": {"$arrayElemAt": [["red", "green", "blue"], {"$rndInt": {"max": 4}}]}}}, {
"$barChart": {
"parseDates": false}}, {
"$addFields": {
"chart.series.0.columns.template.propertyFields.fill": "color",
"chart.series.0.strokeOpacity": 1}}


Or you can use some property of the incoming values to compute a color:
{   "process": [{
"$spread": {
"path": "result.score.scaled",
"max": 1.00,
"min": 0,
"count": 6}}, {
"$addFields": {
"color": {"$concat": ["rgb(0,100,", {"$toString": "$count"}, ")"]}}}, {
"$barChart": {
"parseDates": false}}, {
"$addFields": {
"chart.series.0.columns.template.propertyFields.fill": "color"}}]}


A $pieChart uses slices instead of columns. Here, you compute a gradient, coloring the slice based on its position in the array.
{   "process": [{
"$spread": {
"path": "result.score.scaled",
"max": 1.00,
"min": 0,
"count": 15}}, {
"$addFields": {
"color": {"$concat: ["rgb(0,100,", {"$toString": {"$floor": {"$multiply": ["$_id", 255]}}}, ")"]}}}, {
"$pieChart": {
"parseDates": false}}, {
"$addFields": {
"chart.series.0.slices.template.propertyFields.fill": "color",
"chart.series.0.strokeOpacity": 1,
"chart.series.0.slices.template.stroke": "black"}}]}



Below is an example that changes the background-color:
{   "process": [{
"$spread": {
"path": "result.score.scaled",
"max": 1.00,
"min": 0,
"count": 15}}, {
"$serialChart": {
"parseDates": false}}, {
"$addFields": {
"chart.series.0.fill": "rgba(0,0,0,0)",
"chart.series.0.fill": "rgba(0,0,0,0)",
"chart.series.0.strokeOpacity": 1,
"chart.series.0.strokeWidth": 5,
"chart.series.0.stroke": "#F74",
"chart.background.fill": "white",
"chart.xAxes.0.renderer.grid.strokeOpacity": 0.2,
"chart.yAxes.0.renderer.grid.strokeOpacity": 0.2,
"chart.xAxes.0.renderer.baseGrid.disabled": true}}]}

You can even radically change the chart by setting the chart and series types:
{   "process": [{
"$spread": {
"path": "result.score.scaled",
"max": 1.00,
"min": 0.0,
"count": 20}}, {
"$serialChart": {
"parseDates": false}}, {
"$addFields": {
"chart.series.0.fill": "#8F9",
"chart.series.0.strokeOpacity": 1,
"chart.series.0.strokeWidth": 5,
"chart.series.0.stroke": "#6F4",
"chart.background.fill": "white",
"chart.xAxes.0.renderer.grid.strokeOpacity": 0.2,
"chart.yAxes.0.renderer.grid.strokeOpacity": 0.2,
"chart.yAxes.0.renderer.grid.strokeWidth": 2,
"chart.xAxes.0.renderer.baseGrid.disabled": true,
"chart.type": "RadarChart",
"chart.series.0.type": "RadarSeries"}}]}

This just touches on the absolute minimum of possibilities. Read the amCharts documentation for more information.

Calling the Renderer Remotely

Remember that the output of the query is not a chart, but a JSON object that describes the chart. Some parts of the LRS can send the results of the query to the renderer transparently. For instance, if you enter VQL in an Advanced Chart Builder widget in a custom dashboard, it's rendered automatically.

However, you can use our client library to render the result and return it to an external web page. Calling on our client library does not send the data to our servers; it just pulls in some logic into your code.

In your HTML, add the following line in the head:
<script src="https://lrs.io/integrations/public/vqlUtils/renderer.js"></script>
In the body, create a div and give it an ID:
<div id="chart"></div>
And call the VQL renderer:
vqlRender("chart", yourVQL_result, amCharts_theme_name, amCharts_theme_url, background);
This inserts an iframe to the #chart div, which pulls in our renderer. The LRS automatically sends the yourVQL_result data to the iframe when ready to render. In the above example, you're responsible for getting the VQL result to the client. You can either run the query via AJAX or run the query server-side and template the resulting JSON into a server-side HTML page render.



Notes
amCharts, Elasticsearch, MongoDB are all registered trademarks of their respective companies.



    • Related Articles

    • Veracity Learning Basics

      Veracity Learning is an easy to use, cloud or self-hosted Learning Record Store (LRS). The most essential feature of an LRS is that it receives and processes eXperience API (xAPI) data transactions. xAPI is a new standard for moving around ...
    • Versions of Veracity Learning

      Veracity Learning LRS comes in a few versions. The main version, and the one you're most likely looking at right now, is LRS.io. LRS.io (SaaS Hosted Plans) This is our cloud hosted SaaS service, where anyone can create an LRS. You can use the free ...
    • Custom Dashboards and Charts

      The Veracity Learning LRS offers several ways to generate custom dashboards. At the most basic level, you can place widgets (charts or lists) on a dashboard page. You can customize each widget with a set of options like what actor, what object, or ...
    • Advanced Search and Aggregation

      Veracity exposes a few extensions to our interfaces that allow you additional options for querying and analyzing data. Advanced Search Our query builder interface lets you filter on stacks of almost any property in the official xAPI specification, ...
    • Saved Scripts

      You can use the Saved Script feature to store scripts in VQL format. You can attach these scripts to the read or write functions of an access key, or the outgoing data that the LRS sends via a forwarding rule. In the future, more areas of the xAPI ...