Index naming, metadata, and templates

Overview

We would like to be able to specify the data path for a particular index when the index is created.

Three things are required before this can work.

First, we need to avoid #8254 and the complications of filesystem-specific naming of indices. Currently, when an index is re-imported as a dangling index, its directory name is used as the index name. This can lead to issues for Unicode index names.

Second, we need to be able to specify the path for the index's data when the index is created.

Third, we need to be able to specify a template for how the directory should be laid out. We can reuse the existing mustache template functionality inside of Elasticsearch for this, so we can do something like:

POST /test
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1,
    "data_path": "/mnt/testdata",
    "path_template": "{{index_name}}/shard_{{shard_num}}"
  },
  "mappings": {
    ...
  }
}

This would put data in /mnt/testdata/test/shard_0, /mnt/testdata/test/shard_1, etc.
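
As a rough illustration of the expansion described above, the following Python sketch renders a path_template and joins it to data_path. It is not how Elasticsearch implements this: the function names render_path_template and shard_path are hypothetical, and the regex only handles plain {{name}} placeholders rather than the full mustache syntax.

```python
import posixpath
import re

# Matches simple {{name}} placeholders only. Sections, partials, and
# escaping from full mustache are deliberately not handled in this sketch.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_path_template(template, variables):
    """Substitute {{name}} placeholders, failing on unknown names."""
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError("unknown template variable: " + name)
        return str(variables[name])
    return _PLACEHOLDER.sub(substitute, template)


def shard_path(data_path, path_template, index_name, shard_num):
    """Join the index's data_path with the rendered template (hypothetical helper)."""
    relative = render_path_template(
        path_template, {"index_name": index_name, "shard_num": shard_num}
    )
    return posixpath.join(data_path, relative)
```

Calling shard_path("/mnt/testdata", "{{index_name}}/shard_{{shard_num}}", "test", 0) yields the first path in the example above. Rejecting unknown variables, rather than silently rendering them as empty strings, is a choice made for this sketch so that a typo in a template cannot collapse two shards into the same directory.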

Steps

Here is what we need to do:

1. Add functionality to read index name from a metadata file

Instead of using the directory name, we should read the name of the index from a meta.json file in the top-level index directory.

For the index "test", we would read:

/mnt/testdata/test/meta.json

Which would contain something like:

{
  "name": "test"
}

If this file doesn't exist, we can fall back to using the directory name for the index name.
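
The lookup and fallback described in this step could be sketched as follows in Python. The function name read_index_name is hypothetical, and the sketch assumes meta.json is UTF-8 encoded JSON with a top-level "name" key, as in the example above.

```python
import json
import os


def read_index_name(index_dir):
    """Return the index name from meta.json, else the directory name."""
    meta_file = os.path.join(index_dir, "meta.json")
    try:
        with open(meta_file, encoding="utf-8") as handle:
            return json.load(handle)["name"]
    except FileNotFoundError:
        # No metadata file: fall back to the directory name.
        # normpath drops any trailing separator before taking the basename.
        return os.path.basename(os.path.normpath(index_dir))
```

Only a missing file triggers the fallback here. A malformed meta.json or one without a "name" key raises an error instead, since the text above does not say how those cases should be handled.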

2. Specifying the data_path for an index

Using only the data_path without a template should simply write all of that index's data to the given location instead of the location given by the default path.data configuration value.
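
A minimal sketch of that precedence, in Python, might look like the following. The function name resolve_data_root is hypothetical, and plain dictionaries stand in for the index-level and node-level settings.

```python
def resolve_data_root(index_settings, node_settings):
    """Prefer the index-level data_path, else the node-level path.data."""
    data_path = index_settings.get("data_path")
    if data_path:
        # The index was created with its own data_path: use it as given.
        return data_path
    # Otherwise fall back to the default configured location.
    return node_settings["path.data"]
```

An index created without data_path therefore behaves exactly as it does today.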

3. Specifying a path_template for an index

We should choose a good default path_template and document what that default is.

Author: Lee Hinman

Created: 2014-11-13 Thu 11:15
