Elasticsearch Integration

Overview

The LRS can accelerate its dashboard and statement viewer by using Elasticsearch in addition to MongoDB. When configured with the optional Elasticsearch connection, the LRS synchronizes xAPI statements between the two databases in real time. The system detects analytics queries that Elasticsearch can fulfill, then translates and dispatches them automatically. This happens under the hood and is invisible to both the user interface and the API. The effects, however, will be obvious: on larger data sets, many queries can be hundreds of times faster when accelerated in this way.

To synchronize data with Elasticsearch, the LRS must be given a connection string for an Elasticsearch server. We currently support Elasticsearch version 7.x. The LRS will also function when connected to OpenSearch in Elasticsearch 7.17 compatibility mode.
Warning
The Elasticsearch integration is required for users who wish to analyze large data sets. If the number of xAPI statements in an LRS is greater than the value of the maxVQLWindowSize setting, then some analytics queries will not consider the entire data set. The interaction between maxVQLWindowSize and a given query is complex, and in many cases the results will still be correct, but in this state the server may return incomplete results. If you wish to use MongoDB alone to analyze large data sets, set maxVQLWindowSize to an arbitrarily large number. Also be aware of the globalMongoTimeout setting, which controls the maximum amount of time a query may run.

Configuration

There are several settings relevant to the Elasticsearch connection. As with all settings, these can be supplied as environment variables, command line parameters, or defined in the .env file.

Setting Name
Discussion
elasticSearchServer
The connection string to Elasticsearch. This is always a web address, for example, http://192.168.1.2:9200. It should include the port number on which Elasticsearch is listening, unless that port is the default port for the given protocol (80 for HTTP and 443 for HTTPS). It should not include a path. If your Elasticsearch deployment is configured to use authentication, then supply the name and password as HTTP basic authentication parameters in URL form. For example, https://username:password@myserver.org:9200.
elasticReconnectTimeout
A human-readable interval, such as 30 seconds. The time after which the server will attempt to reconnect to Elasticsearch when a connection cannot be established. Default: 30 seconds.
esIndexExtensions
Whether xAPI extensions should be included in the Elasticsearch mapping. Default: true. Because xAPI allows arbitrary data in extensions, it is possible to generate data that Elasticsearch cannot ingest. This can happen when extensions change the data type at a particular path, for example by using a string in some statements but an integer in others. In these cases, Elasticsearch will drop the data. This can leave a large share of statements missing from the index, as measured against elasticMissingRatioError. If too much data fails to be stored in this way, the LRS will eventually treat the Elasticsearch index as "unhealthy" and refuse to use it.
elasticTimeout
A human-readable interval, such as 30 seconds. The maximum amount of time the LRS will wait for a response from Elasticsearch. Requests that take longer than this are canceled. Default: 20 seconds. A timed-out request has downstream effects: dashboard graphs may disappear or show the "No data to display" error message.
elasticMissingRatioError
A number between 0 and 1. Defines the ratio of the Elasticsearch statement count to the MongoDB statement count that is considered normal. Default: 0.99. This means that by default, an Elasticsearch index containing at least 99% of the MongoDB statements is healthy enough to use. While it may be tempting to set this value to 1.0 for maximum consistency, statements are always stored in MongoDB before Elasticsearch. Under load, the ongoing traffic can mean that Elasticsearch never reaches the same count as MongoDB. Lower this ratio if, under high load, you see a notification describing the index as unhealthy.
esDynamicExtensions
Controls how xAPI extensions are treated when esIndexExtensions is true. Default: true. Use true for regular Elasticsearch dynamic mapping, or runtime for runtime mapping. A discussion of the ramifications of each selection is beyond the scope of this document.
esKeywordMaxLength
A number. Strings longer than this will be truncated in the mapping. Default: 256. Increase this value if you have very long identifiers. Too small a value can cause different xAPI objects, agents, or activities to be treated as the same even though their IDs differ.
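The rules given for the connection string (a web address, a port unless the protocol default applies, no path, optional basic authentication in URL form) can be checked before the value is handed to the LRS. The following is a minimal sketch using only the Python standard library. The function name check_connection_string is hypothetical, and the check reflects only the rules stated in the table, not how the LRS itself parses the value.

```python
from urllib.parse import urlsplit

# Default ports per protocol, as stated in the settings table.
DEFAULT_PORTS = {"http": 80, "https": 443}


def check_connection_string(value: str) -> dict:
    """Split a connection string into parts, rejecting forms the table rules out.

    Hypothetical helper for pre-flight checks; not part of the LRS.
    """
    parts = urlsplit(value)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError("expected an http:// or https:// web address")
    if not parts.hostname:
        raise ValueError("missing host name")
    if parts.path or parts.query or parts.fragment:
        raise ValueError("the connection string should not include a path")
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        # Fall back to the protocol default when no port is given.
        "port": parts.port or DEFAULT_PORTS[parts.scheme],
        "username": parts.username,
        "has_password": parts.password is not None,
    }


if __name__ == "__main__":
    print(check_connection_string("http://192.168.1.2:9200"))
    print(check_connection_string("https://username:password@myserver.org:9200"))
```

This sketch is strict about paths, so even a bare trailing slash is rejected. That strictness is a choice made for the example, not documented LRS behavior.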

Permissions

The LRS will create and destroy Elasticsearch indexes as LRS tenants are created and destroyed. This means that the user account the LRS uses to access Elasticsearch should have permission to create and delete indexes. The LRS will also query, update, and change the mapping of the indexes it controls.
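If your deployment uses Elasticsearch security roles, the permissions described above might be expressed as a role body like the one sketched here. The privilege names are standard Elasticsearch index privileges, and the body follows the shape accepted by the role API (PUT /_security/role/<name>). The index name pattern lrs-* is purely hypothetical, since the names the LRS gives its indexes are not stated in this document. Whether additional cluster-level privileges are needed is also not covered here.

```python
import json

# Hypothetical index name pattern; replace it with the pattern that
# matches the indexes the LRS actually creates in your deployment.
INDEX_PATTERN = "lrs-*"

# Role body sketch: index-level privileges only.
role_body = {
    "indices": [
        {
            "names": [INDEX_PATTERN],
            "privileges": [
                "create_index",  # indexes are created along with tenants
                "delete_index",  # and destroyed along with them
                "read",          # querying
                "write",         # inserting and updating documents
                "manage",        # index administration, including mapping changes
            ],
        }
    ]
}

print(json.dumps(role_body, indent=2))
```

The script only prints the body. Applying it to a cluster is left to whatever tooling you already use to manage Elasticsearch roles.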

Connectivity

As of LRS version 1.12.6, the LRS will attempt to reconnect to Elasticsearch if connectivity is interrupted. Normally, this should never occur, and you should endeavor to make sure Elasticsearch is always available. The reconnection feature is intended to mitigate only transient connection problems. If the Elasticsearch server must be shut down and you cannot stop the LRS from accepting new data, then the index will be out of date when connectivity is restored. Once this occurs, the index will have to be resynchronized from MongoDB.
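The general idea of retrying a failed connection at a fixed interval can be sketched as follows. This illustrates the pattern only and is not the LRS implementation. The function connect_with_retry, its parameters, and the attempt limit are assumptions made for the example; the 30 second default mirrors the default of elasticReconnectTimeout.

```python
import time
from typing import Callable


def connect_with_retry(
    connect: Callable[[], bool],
    retry_seconds: float = 30.0,
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call connect() until it succeeds and return the number of attempts used.

    Raises ConnectionError after max_attempts consecutive failures.
    """
    for attempt in range(1, max_attempts + 1):
        if connect():
            return attempt
        if attempt < max_attempts:
            sleep(retry_seconds)  # wait before the next attempt
    raise ConnectionError(f"no connection after {max_attempts} attempts")
```

The sleep function is injectable so the loop can be exercised without real delays. A retry loop like this only restores connectivity; any statements missed during the outage still need to be resynchronized.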

Possible Error States

Issues with the Elasticsearch connection and index will be displayed in warning banners on the LRS home page, or on the home page for a particular tenant.
"It looks like this server does not have an Elasticsearch connection. While optional, an Elasticsearch connection is highly recommended."
This message means that the server is not currently connected to Elasticsearch. Either no connection is configured, or one is configured but the configuration is incorrect or the connection could not be established. Check the server configuration and logs for more information.
"The analytics index for this LRS seems to be missing data. If this persists for more than a few minutes, please use the database management page to rebuild the analytics index."
The analytics index for the LRS is unhealthy. This can mean that no index exists, or that one exists but contains an unexpected amount of data. The index will need to be rebuilt or resynced. In this state, new data will still be written to Elasticsearch, but the analytics system will not accelerate queries.
"The analytics index was built with a previous version of the LRS and needs to be rebuilt. Please use the database management page to rebuild the analytics index."
The LRS can detect differences between the expected and actual configuration of the Elasticsearch index for a given tenant. This message means that the LRS software was updated or downgraded, or that the Elasticsearch index settings were manually changed. Either way, the Elasticsearch index is not in the expected state and will need to be rebuilt.
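The "missing data" condition relates to the elasticMissingRatioError setting, which compares the statement count in Elasticsearch with the count in MongoDB. The arithmetic can be sketched as below. The function name, the treatment of an empty MongoDB collection, and the use of an inclusive comparison are assumptions for illustration; the exact check the LRS performs is not documented here.

```python
def index_is_healthy(
    es_count: int,
    mongo_count: int,
    missing_ratio_error: float = 0.99,
) -> bool:
    """Return True when Elasticsearch holds a large enough share of the statements."""
    if mongo_count == 0:
        # Nothing to index yet; treated as healthy in this sketch (assumption).
        return True
    return es_count / mongo_count >= missing_ratio_error


# 995,000 of 1,000,000 statements indexed: ratio 0.995, above the 0.99 default.
print(index_is_healthy(995_000, 1_000_000))  # True

# 980,000 of 1,000,000 statements indexed: ratio 0.98, below the default.
print(index_is_healthy(980_000, 1_000_000))  # False

# The same index passes once the threshold is lowered to 0.95.
print(index_is_healthy(980_000, 1_000_000, missing_ratio_error=0.95))  # True
```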

Rebuilding the Index

Because the LRS stores data primarily in MongoDB, the Elasticsearch index can be rebuilt from the information stored in MongoDB. Various error conditions may require you to rebuild the index or resync the data. Each of the two options below launches a background processing job on the LRS that completes asynchronously. These jobs can last from seconds to hours depending on the amount of data. As a rule of thumb, expect roughly one second for every thousand xAPI statements. You can capture new data while this process runs, but note that significant server resources will be required, so plan to run it during times of low load. Each of these processes can be launched from All Management Tools > Database Management in each tenant.
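The rule of thumb of one second per thousand statements gives a quick way to plan a maintenance window. The helper names below are hypothetical, and real durations will vary with hardware and load.

```python
def estimate_rebuild_seconds(
    statement_count: int,
    seconds_per_thousand: float = 1.0,
) -> float:
    """Estimate job duration from the one-second-per-thousand rule of thumb."""
    return statement_count / 1000 * seconds_per_thousand


def format_duration(seconds: float) -> str:
    """Render a duration in seconds as hours, minutes, and seconds."""
    total = int(round(seconds))
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes}m {secs}s"


# Five million statements: 5,000 seconds.
print(format_duration(estimate_rebuild_seconds(5_000_000)))  # 1h 23m 20s
```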

Rebuild Analytics Index
This process will completely rebuild the analytics index for the given tenant. If an index exists, it will be destroyed and recreated. If no index exists, a new one will be created and configured. Use this when connecting to Elasticsearch for the first time, or if the existing index was built with an incompatible version of the LRS.

Resync Analytics Index
The LRS keeps track in MongoDB of whether each statement was successfully inserted into Elasticsearch. This process will insert into Elasticsearch only those statements that have not been successfully inserted before. Note that actual IDs are not compared: if you manually remove data from Elasticsearch, the LRS will be unaware of exactly which statements are missing, and a full rebuild will be required. Use this process to fix the index after an intermittent connection issue.
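The difference between a resync and a rebuild can be illustrated with a toy model. The sketch below uses plain dictionaries in place of MongoDB documents, and the field name indexed is hypothetical; the actual way the LRS records insertion status is not described in this document.

```python
from typing import Iterable, List


def statements_to_resync(statements: Iterable[dict]) -> List[dict]:
    """Return only the statements not yet flagged as inserted into Elasticsearch."""
    return [s for s in statements if not s.get("indexed", False)]


# Toy records standing in for MongoDB documents.
mongo_statements = [
    {"id": "a", "indexed": True},
    {"id": "b", "indexed": False},  # insert failed during an outage
    {"id": "c"},                    # insert never attempted
    {"id": "d", "indexed": True},   # later deleted from Elasticsearch by hand
]

pending = statements_to_resync(mongo_statements)
print([s["id"] for s in pending])  # ['b', 'c']
```

Statement d is not selected because its flag still says it was inserted. This is why data removed manually from Elasticsearch can only be recovered with a full rebuild.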

