ds_core - Datasource Query Core
The datasource core (ds_core) module provides a modular data access and caching layer for consumption by other Momentum modules. Datasource driver modules can be loaded to provide access to third-party data stores (such as LDAP, ODBC, etc.). A C-language SDK is available to allow third parties to build their own datasource drivers for use with the datasource layer.
The datasource layer is designed to cache query results to avoid overwhelming the data store. Each cache is configured to query against a particular data store and can have distinct size and time-to-live (TTL) settings for the data, allowing an administrator many opportunities for tuning.
The ds_core module is loaded implicitly as required and need not be specified in the configuration file. Nor do datasource drivers need to be explicitly loaded. However, you must add the appropriate database modules during installation.
The following options can be specified on a per-cache basis:
- cache_life
-
Specifies the time-to-live of a given cache entry, in seconds. Default value is
3600
seconds. - cache_size
-
Maximum number of cache entries to store in the cache. During cache maintenance, if the cache size exceeds this amount, the oldest cache entries will be dropped until the cache size returns to the configured size. Default value is
1000
. - max_rows
-
Maximum number of rows to fetch while performing a query. This acts as a brake to prevent accidental (or perhaps deliberate) querying of a very large number of rows from a database. Default value is
100
rows. To return an unlimited number of rows use-1
. Note: When querying a CSV data source you have no option but to return all rows. Otherwise, setting this value to-1
is not recommended. - no_cache
-
When set to
true
, query results will not be cached. Default value isfalse
. - pool
-
Systems with a large IO pool may run into an issue whereby Momentum maintains many idle connections to the datasource or datasources that are in use; this can potentially starve other machines of connections. By specifying the thread pool that is used, you can configure the datasource thread
pool
size independently of the IO pool size, dispatching datasource queries on a per-datasource basis. The threadpool referenced by this option must be defined. For more information about threadpools see threadpool. Default value isIO
. - serialized
-
When set to
true
, query results will be serialized and published across the cluster. This is most effective when used with queries that are expensive to run and where the results are likely to be needed on more than one node of the cluster within a short time span. To share a cache across a cluster, you also need to add a replicate stanza to the cluster module. For an example, see “Replicated Caches”. Default value istrue
.You are encouraged to create a separate cache dedicated to this type of query, should you require it.
- uri
-
Contains all the details required to locate a driver and connect to the data source. There is no universal format, except that the start of the string, up to the first colon (
:
) character specifies the driver name.Multiple
uri
lines can be provided and will be used in a round-robin fashion for simplistic load balancing. It is important that all such datasources return identical results, since each URI will be chosen sequentially. All URI's are considered equivalent (i.e. no one URI can be marked as "preferred"), and no failover action will be attempted (except in the sense that "dead" connections will be skipped).The
uri
option is an array, letting you specify more than one datasource. The following is an example configuration:Datasource "mssql" { uri = ( "odbc:DSN=sqlserver;UID=user;PWD=pass" "odbc:DSN=sqlserver2;UID=user2;PWD=pass3" "odbc:DSN=sqlserver3;UID=user3;PWD=pass3" "odbc:DSN=sqlserver4;UID=user4;PWD=pass4" ) cache_size = 200 cache_life = 360 }
The following is an example configuration stanza:
Datasource "mssql" {
uri = ( "odbc:DSN=sqlserver;UID=user;PWD=pass" )
cache_size = 1000
cache_life = 3600
}
This example loads the datasource core and the ODBC datasource driver and configures a cache to query against a MS SQL Server It creates a cache named mssql
that will store the results of up to 1000 distinct queries. A datasource must have a name to identify a given datasource instance in order to apply configuration changes. Cache items will remain in the cache for up to 3600 seconds.
If a cache miss occurs, the uri
parameter specifies how to connect to the datasource to fetch the data. The syntax for uri
is the driver name followed by a single colon character. Everything after the colon is passed verbatim to the driver in order to create the connection. What is passed varies depending upon the requirements of the driver. In the example, the SQLite driver only requires the driver name followed by the path to the database. The ODBC driver shown above requires a DSN, a user ID and password in addition to the driver name. Check driver documentation for the exact syntax.
Note
The cache name is a token of your choosing. Typically, the name reflects the database type as it does in the preceding examples. However, the cache name is the value that must be passed to Lua database functions when querying a datasource. Note: "Cache name" and "datasource name" are used interchangeably throughout the Message Systems documentation.
Datasource queries are run in one of the thread pools (almost always the IO thread pool, as the data fetch is an IO-bound operation), which allows Momentum to continue serving other sessions while a query completes. You may specify the threadpool using the pool
option.
Datasources are not modules so you cannot determine their names using the module list command. Use config locate datasource instead.
To set datasource options from the console, use the syntax config set datasource ds_name option value
. For example, set the cache_life
of the datasource defined in “Specifying a Datasource” in the following way:
Where possible, the datasource layer determines the charset encoding of the data and stores it in the cache in the form requested by the first query for that data. This will usually be for UTF-8 data.
If a driver is unable to determine the charset, Momentum will assume that the data is encoded using ISO-8859-1 (latin-1). You may override this assumption by setting the assume_charset
option in the datasource declaration:
Datasource drivers are loaded automatically as needed and do not need to be explicitly included in your configuration. The following sections list the available drivers.
For Lua scripting examples, see:
The SQLite driver provides access to any SQLite database. This driver is installed by default.
The ODBC driver provides database access to any data source for which an ODBC compliant driver is installed.
Datasource "mssql" {
uri = ( "odbc:DSN=sqlserver;UID=user;PWD=pass" )
cache_size = 200
cache_life = 360
}
The uri
consists of the string odbc:
followed by any valid ODBC connection string; the connection string is passed directly to the ODBC driver manager to establish the connection.
The ODBC driver accepts ?
and :named
style parameters in prepared statements.
Note
There is an ODBC-specific internal parameter, long_threshold
, used for determining whether a field data type should be treated as a long value. Changing this parameter without the guidance of support is not recommended.
If you're using Momentum 4.7 or later which use system's "unixODBC" package, the paths that refer to /opt/msys/3rdParty/
don't apply. For example, the configuration files odbc.ini
and odbcinst.ini
are to be placed under /etc
instead of /opt/msys/3rdParty/etc
.
The following apply when using unixODBC:
-
Package files are stored in directories below the
/opt/msys/3rdParty
directory. -
Library files are found in the
lib
directory (lib64
on 64 bit systems). -
Executables files are found in the
bin
directory. -
If you are installing a driver or configuring an ODBC DSN, use the odbcinst command found in the
bin
directory. For information on using this tool, see http://www.unixodbc.org/odbcinst.html. -
In order for unixODBC to work with Momentum, the configuration files
odbc.ini
andodbcinst.ini
must be installed in the/opt/msys/3rdParty/etc
directory. -
The Message Systems unixODBC package automatically adds itself to the appropriate unixODBC configuration files when installed. Likewise, when it is removed.
-
The configuration of a unixODBC datasource is identical to the configuration of an ODBC datasource.
The LDAP driver provides access to LDAP directories, including Microsoft ActiveDirectory. The connection URI should be an RFC 2255 compliant LDAP URL. Briefly, the format for RFC 2255 URLs is ldap://host:port/dn?attributes?scope?filter?extensions
Queries are also specified in the form of an LDAP URL, any components of the search criteria that are not specified in the query string are taken from the connection URI. Momentum supports the RFC 2255 bindname
extension for specifying the DN to use for a bind, the commonly used x-bindpw
extension to specify the associated password, and the additional x-ignore-referrals=(0|1)
extension to allow disabling following referrals. If you wish to have the DN of the resulting object returned as results of the query, you may use the 'x-return-dn' extension to define the attribute name to assign to the DN in the resulting data set.
By default, LDAP connections are cached and re-used for subsequent queries with the same hostname, port and bind parameters. If one wishes to disable such caching, the x-disable-connection-cache=true
extension may be used.
Datasource "mydirectory" {
uri = ( "ldap://10.80.116.112" )
cache_size = 200
cache_life = 360
}
The extensions supported by ds_ldap are:
- bindname
-
DN to use for a bind.
- x-bindpw
-
Password associated with the
bindname
. - x-connect-timeout
-
Number specifying the time-out in seconds for connecting to the LDAP server. If this time is exceeded, the query fails. The default is determined by your operating system, and is typically of the order of several minutes.
- x-disable-connection-cache
-
If
true
, disable re-use of LDAP server connections for queries with the same server parameters. - x-ignore-referrals
-
If
0
, disables following referrals. - x-ldap-version
-
Number specifying the LDAP protocol version to use.
Note
LDAPS queries that don't specify
x-ldap-version=3
are liable to fail. - x-return-dn
-
Attribute name to assign to the DN in the resulting data set.
- x-search-timeout
-
Number in seconds that specifies: how long ds_ldap should wait for a query to complete; how long the server should spend processing the query (the search time limit). If this time is exceeded, the query fails. Default value is
6
seconds.
The configuration options supported by ds_ldap are:
- connection_cache_size
-
Maximum number of connections to store in the cache. Once this limit is reached, the creation of a new connection will cause an existing connection to be closed, so that the new connection can be added to the cache.
- connection_cache_life
-
Maximum age of a connection in the cache, after which the connection is closed (and re-opened, if necessary). This option is specified in seconds.
To authenticate using Active Directory, create a configuration such as the following:
Datasource "ldap" {
uri = ( "ldap://example.com" )
no_cache = true
}
ds_ldap {}
auth_ds {
Scheme "ecauth" {
Authenticate {
query =
"ldap:///????bindname=CN=$user%2cCN=Users%2cDC=int%2cDC=omniti%2cDC=net,x-bindpw=$pass"
cache = "ldap"
map = [
user = "%{username}"
pass = "%{password}"
]
}
}
}
Control_Listener {
AuthLoginParameters = [
uri = "ecauth://"
]
Enable_Authentication = "true"
Enable_Authorization = "false"
AuthorizationParameters = [
uri = "ecauth://"
]
Listen "/tmp/2025" {
Enable_Authentication = "false"
}
Listen "127.0.0.1:2025" {
}
}
The Momentum LDAP driver supports a "bindname" extension for overriding the DN, user, and password used to connect to LDAP. The DN specified in the LDAP query is standard. In “Active Directory Authentication Configuration”, you'll notice the substitutions for both $user
and $pass
. The only non-standard additional bit is x-bindpw
, which is again supported by our LDAP driver.
You can limit access to specific roles defined in Active directory by creating an LDAP query to specify only a single group or multiple groups in the query. That query would look similar to:
(&(objectClass=user))(memberOf=cn=$user,cn=GROUPNAME,ou=ORGUNIT,dc=DCPART))
To a degree the format depends upon on how the LDAP database is set up, but generally speaking the user's name is the most specific part of the common name.
Note
If you choose to authenticate using Active Directory, you cannot also authenticate using the Momentum PostgreSQL database. If you wish to use both, you must consult with professional services.
The ds_ldap module can be controlled through the ec_console
. The following commands are available:
- ds_ldap flush connection cache
-
This command removes all entries from the connection cache.
- ds_ldap show connection cache stats
-
This command outputs statistics relating to the connection cache.
The CSV driver provides access to any comma-separated file. This driver is installed by default. The following code includes the CSV driver and defines the properties of the CSV cache:
Datasource "csv" {
uri = "csv:"
max_rows = -1
}
In this example, the uri
parameter defines the protocol, and the max_rows
parameter is set to -1
, indicating that an unlimited number of rows are being returned. Because CSV data sources return all rows, they are useful for look-up operations where every record must be examined.
The PostgreSQL driver provides access to any PostgreSQL database. The following code includes the PostgreSQL driver and configures a cache to query against a PostgreSQL database:
Datasource "pgsql" {
uri = ( "pgsql:host=example.com;dbname=db_name;user=user_name;»
password=secret;port=5432" )
}
This data source also makes use of the other configuration parameters described in “Configuration”.
The CDB driver provides access to any CDB database. This driver is installed by default. The following code includes the CDB driver and configures a cache to query against a CDB database:
Datasource "cdb" { uri = ( "cdb:/path/to/data.cdb" ) }
The uri
consists of the string cdb:
followed by the path to the CDB database.
The I/O Wrapper driver allows the contents of a file to be queried and returned as a single row consisting of a single column. While this does not sound hugely impressive on its own, it provides access to any resource for which an IO wrapper exists. For instance, loading the http_io module provides support for pulling data over HTTP.
The I/O wrapper driver is installed by default and supports $named
style parameters.
Datasource "io" {
uri = "iowrapper:"
cache_size = 200
cache_life = 360
}
The MySQL driver provides access to a MySQL database. Go to the Message Systems support website and download the MySQL driver suitable to your operating system, architecture, and version of Momentum. Install this driver by issuing the command: rpm -Uvh mysy-ecelerity-mysql-version.os.arch
.rpm rpm -Uvh ecelerity-mysql-version.os.arch
.rpm
Add the following to your configuration:
Datasource "mysqldb" {
uri = ( "mysql:host=localhost;port=3306;dbname=ecmail; »
user=user;password=password" )
}
Note
For licensing reasons the MySQL module does not ship with Momentum and must be downloaded separately.
The datasource module provides the following console commands:
- ds_core flush *`cache_name`*
-
The ds_core flush command takes a single argument specifying the name of the cache to flush. The following is the output of the command ds_core flush sqlite :
flushed 2 entries from cache 'sqlite'
In this case
sqlite
corresponds to a cache defined in the following way:Datasource "sqlite" { uri = ( "sqlite:/path/to/db" ) }
Datasources are not modules so you cannot determine their names using the module list command. Use config locate datasource instead.
- ds_core stats
-
For each cache, this command prints statistics including the hit rate, number of cache hits and misses during lookups, number of inserts, and number of deletions. The deletions are broken down into entries explicitly (manually) deleted, those deleted due to time-to-live (ttl) expiration as configured via
cache_life
, and least-recently-used (lru) items deleted due to thecache_size
being reached.13:07:12 ecelerity> ds_core stats cache name lookups % deletions % elts hit miss pend total inserts man ttl lru total ---------------------- ----- -------------------- -------- ------------------ ecdb 2 72 28 0 335 6 0 100 0 4
Note: Cache names may be truncated; only the first 10 characters are shown.