MarkLogic Index Data Types

  • 28 July, 2022
  • By Dave Cassel

MarkLogic offers several types of indexes: Universal, range, and triples. These indexes provide fast access to your content and can be configured to work with specific data types. MarkLogic will even do some type conversions for you.

Universal Index

Let’s insert a couple of documents. Note the difference between the two “updated” values (one has a “T” between the date and the time, the other has a space) and between the types of the someNumber property (a number versus a string).

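'use strict';
declareUpdate();

// doc1: "T" separator in the dateTime; someNumber is a JSON number
xdmp.documentInsert(
  "/content/doc1.json",
  {
    "updated": "2022-07-13T00:00:00",
    "someNumber": 1
  }
);

// doc2: space instead of "T"; someNumber is a JSON string
xdmp.documentInsert(
  "/content/doc2.json",
  {
    "updated": "2022-07-12 00:00:00",
    "someNumber": "2"
  }
);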

The Universal Index will store each of these values, along with the structure, as they are provided to MarkLogic. We can query those as soon as the transaction completes. To do so, we need to query for the specific value of the right type: cts.jsonPropertyValueQuery("someNumber", 1) will find doc1.json, but cts.jsonPropertyValueQuery("someNumber", "1") will not.
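A minimal sketch of that type sensitivity in Server-Side JavaScript, assuming the two documents above have been inserted. The helper name urisMatching is hypothetical; cts.search, cts.jsonPropertyValueQuery, and xdmp.nodeUri are MarkLogic built-ins, so the function only does something useful when called inside MarkLogic (for example, from Query Console).

```javascript
'use strict';

// Hypothetical helper: list the URIs of documents whose JSON property
// `name` holds exactly `value`. The Universal Index distinguishes the
// number 1 from the string "1", so the JavaScript type of `value` matters.
function urisMatching(name, value) {
  const uris = [];
  for (const doc of cts.search(cts.jsonPropertyValueQuery(name, value))) {
    uris.push(String(xdmp.nodeUri(doc)));
  }
  return uris;
}
```

Going by the behavior described above, urisMatching("someNumber", 1) should return only /content/doc1.json, while urisMatching("someNumber", "1") should return an empty array. doc2.json should be found only by the string "2".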

Range Indexes

Let’s set up two range indexes:

  • On the “updated” property with type “dateTime”
  • On the “someNumber” property with type “int”
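These indexes can be created in the Admin UI. As an alternative, the sketch below scripts them with the Admin API. The function name addSampleRangeIndexes is hypothetical. It assumes that JSON properties are indexed as element range indexes with an empty namespace and that the caller has the privileges needed to change the database configuration. The Admin module is passed in as an argument; in Query Console it would come from require("/MarkLogic/admin.xqy").

```javascript
'use strict';

// Hypothetical helper: add the two range indexes described above.
// `admin` is the MarkLogic Admin API module and `databaseId` identifies
// the content database (for example, the result of xdmp.database()).
function addSampleRangeIndexes(admin, databaseId) {
  const specs = [
    { type: "dateTime", property: "updated" },
    { type: "int", property: "someNumber" }
  ];
  let config = admin.getConfiguration();
  for (const spec of specs) {
    // Arguments: scalar type, namespace (empty for JSON properties),
    // local name, collation (empty for non-string types), and
    // whether to store range value positions.
    const index = admin.databaseRangeElementIndex(
      spec.type, "", spec.property, "", false
    );
    config = admin.databaseAddRangeElementIndex(config, databaseId, index);
  }
  // Nothing changes until the configuration is saved.
  admin.saveConfiguration(config);
}
```

In Query Console this might be called as addSampleRangeIndexes(require("/MarkLogic/admin.xqy"), xdmp.database()).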

I remember that at some point in the past, doc2.json would have been rejected, because a valid dateTime has to have a “T” between the date and the time. (In other words, xs.dateTime("2022-07-12 00:00:00") would fail.) MarkLogic changed that at some point; our sample data values, both with and without the “T”, can be passed to the xs.dateTime constructor successfully. If we ask MarkLogic for the values in the range index, we’ll see both dateTimes (with the “T”):

cts.values(cts.jsonPropertyReference("updated"))
=>
2022-07-12T00:00:00 2022-07-13T00:00:00

Likewise, we can do an inequality query whether our input has the “T” or not:

cts.search(
  cts.jsonPropertyRangeQuery(
    "updated", 
    ">=", 
    xs.dateTime("2022-07-12 00:00:00")
  )
)

Triples Index

The triples index, which powers both triples and views, also does this conversion. Let’s add a template:

'use strict';
const tde = require("/MarkLogic/tde.xqy");
const typeTemplate = xdmp.toJSON(
  {
    "template": {
      "context": "/",
      "directories": ["/content/"],
      "rows": [
        {
          "schemaName": "test",
          "viewName": "types",
          "columns": [
            {
              "name": "updated",
              "scalarType": "dateTime",
              "val": "updated",
              "invalidValues": "reject"
            },
            {
              "name": "someNumber",
              "scalarType": "int",
              "val": "someNumber",
              "invalidValues": "reject"
            }
          ]
        }
      ]
    }
  }
);
tde.templateInsert(
  "/test/typeTemplate.json",
  typeTemplate,
  xdmp.defaultPermissions(),
  ["TDE"]
);

Now we can do a simple query and see that the values have been converted to their target types:

select * from test.types
=>
test.types.updated    test.types.someNumber
2022-07-13T00:00:00   1
2022-07-12T00:00:00   2

Note that our template doesn’t have any code to explicitly convert the values; MarkLogic just does it for us.

Impact

I find this implicit conversion especially helpful for xs.dateTime. Relational databases often use the format without the “T” in the middle. When ingesting data from such sources (or accepting queries from consumers that expect that format), the ingest process would need to add the “T” in order to match the expected format if the implicit conversion didn’t happen.
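For comparison, this is roughly the normalization an ingest step would need if the conversion were not implicit. It is a plain JavaScript sketch; the function name addDateTimeSeparator is hypothetical, and it only handles values that start with the simple "YYYY-MM-DD hh:mm:ss" shape used in this post.

```javascript
'use strict';

// Hypothetical helper: replace the space between date and time with "T".
// Values that do not start with "YYYY-MM-DD hh:mm:ss" are returned unchanged,
// so input that already contains the "T" passes through untouched.
function addDateTimeSeparator(value) {
  return String(value).replace(
    /^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2})/,
    "$1T$2"
  );
}

addDateTimeSeparator("2022-07-12 00:00:00"); // "2022-07-12T00:00:00"
```

With MarkLogic doing the conversion in the range and triples indexes, a step like this can be left out of the ingest code.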

The key thing to remember is that the value in the document (and in the Universal Index) hasn’t changed: MarkLogic stores whatever is provided, and only the range and triples indexes hold the converted values. If you have a property where the source doesn’t reliably provide the same type, your value queries will need to match both type and value (as with the someNumber property above).
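One way to cope with a property whose type varies is to query for both forms at once. This is a minimal sketch, assuming Server-Side JavaScript inside MarkLogic; eitherTypeQuery is a hypothetical name, while cts.orQuery and cts.jsonPropertyValueQuery are built-ins.

```javascript
'use strict';

// Hypothetical helper: match a property value whether it was stored
// as a JSON number or as a JSON string.
function eitherTypeQuery(name, value) {
  return cts.orQuery([
    cts.jsonPropertyValueQuery(name, Number(value)),
    cts.jsonPropertyValueQuery(name, String(value))
  ]);
}
```

cts.search(eitherTypeQuery("someNumber", 2)) should then find doc2.json even though its stored value is the string "2". If the int range index from earlier is in place, a range query with the "=" operator is another option, since the range index holds the converted values.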
