Index naming, metadata, and templates
Overview
We would like to be able to specify the path for a particular index when the index is created.
Three things are required before this can work.
First, we need to avoid #8254 and all the complications of filesystem-specific naming of indices. Currently we use the directory name as the name of the index when re-importing it as a dangling index, which can cause problems for Unicode index names.
Second, we need to be able to specify the path for the index's data when the index is created.
Third, we need to be able to specify a template for how the directory should be laid out. We can reuse the existing mustache template functionality inside Elasticsearch for this, so we can do something like:
POST /test
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1,
    "data_path": "/mnt/testdata",
    "path_template": "{{index_name}}/shard_{{shard_num}}"
  },
  "mappings": { ... }
}
This would put data in /mnt/testdata/test/shard_0, /mnt/testdata/test/shard_1, etc.
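To make the expansion concrete, here is a minimal sketch of how a path_template could be resolved against a data_path. It is written in Python purely for illustration; the real implementation would reuse Elasticsearch's own mustache support. The function name render_shard_path is hypothetical, and the regex handles only plain {{variable}} placeholders, not the full mustache syntax.

```python
import re
from pathlib import PurePosixPath

# Matches simple {{variable}} placeholders only; real mustache supports far more.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_shard_path(data_path, path_template, index_name, shard_num):
    """Expand path_template and join it under data_path (illustrative only)."""
    # Only the two variables used in the proposal are assumed to exist.
    variables = {"index_name": index_name, "shard_num": str(shard_num)}

    def substitute(match):
        key = match.group(1)
        if key not in variables:
            raise KeyError(f"unknown template variable: {key}")
        return variables[key]

    relative = _PLACEHOLDER.sub(substitute, path_template)
    return str(PurePosixPath(data_path) / relative)


if __name__ == "__main__":
    template = "{{index_name}}/shard_{{shard_num}}"
    for shard in range(2):
        print(render_shard_path("/mnt/testdata", template, "test", shard))
```

Rejecting unknown variables, rather than silently expanding them to an empty string, is a design choice in this sketch and not something the proposal specifies.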
Steps
Here's what we need to do this:
1. Add functionality to read index name from a metadata file
Instead of using the directory name, we should read the name of the index from a meta.json file in the top-level index directory.
For the index "test", we would read:
/mnt/testdata/test/meta.json
Which would contain something like:
{ "name": "test" }
If this file doesn't exist, we can fall back to using the directory name for the index name.
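The lookup with its fallback could look roughly like the following. This is a Python sketch for illustration, not the Elasticsearch implementation; the function name read_index_name is hypothetical. How a malformed meta.json should be handled is not covered by this proposal, so the sketch simply lets that error propagate.

```python
import json
from pathlib import Path


def read_index_name(index_dir):
    """Return the index name from meta.json, or the directory name if it is absent."""
    index_dir = Path(index_dir)
    meta_file = index_dir / "meta.json"
    try:
        # Reading as UTF-8 keeps Unicode index names independent of the filesystem.
        with meta_file.open(encoding="utf-8") as handle:
            return json.load(handle)["name"]
    except FileNotFoundError:
        # Fallback described above: use the directory name as the index name.
        return index_dir.name
```

Because the name comes from the file contents, the directory itself can use any filesystem-safe name while the index keeps its original name.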
2. Specifying the data_path for an index
Using only the data_path without a template should simply write all of that index's data to the given location instead of the default path.data configuration value.
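As a rough sketch of that precedence, assuming index and node settings are available as plain dictionaries (an assumption made only for this example; the function name resolve_index_data_root is hypothetical):

```python
def resolve_index_data_root(index_settings, node_settings):
    """Pick the directory an index writes to (illustrative only)."""
    # A per-index data_path, when present, overrides the node-wide path.data.
    custom_path = index_settings.get("data_path")
    if custom_path:
        return custom_path
    return node_settings["path.data"]
```

For an index created with "data_path": "/mnt/testdata", this returns /mnt/testdata; for an index without the setting, it returns whatever path.data is configured on the node.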
3. Specifying a path_template for an index
We should use a good default and document what the default is.