Pure Technical Services

Best Practices for Splunk on Pure Storage

This document covers the various best practices for Splunk on Pure Storage. This includes the Splunk Classic architecture with Hot/Warm on Pure FlashArray, cold on Pure FlashArray over FC/iSCSI, or FlashBlade over NFS as well as Splunk SmartStore architecture with data on the Pure FlashBlade over S3.

Scope

Splunk Classic architecture

  • Hot/Warm/Cold on Pure FlashArray over FC/iSCSI
  • Cold on Pure FlashBlade over NFS 

Splunk SmartStore architecture

  • Hot/Warm Cache on Pure FlashArray over FC/iSCSI (or DAS)
  • Warm remote on Pure FlashBlade over S3

 

Splunk Classic Architecture

Pure Volumes on FlashArray (Hot/Warm, Cold)

Configuring volumes for Splunk indexers could not be simpler: due to the unique capabilities of flash and the design of the Purity Operating Environment, the factors below are neither relevant nor significant on FlashArray.

  • Stripe width and depth: handled automatically. The Purity Operating Environment distributes data across all drives in the array.

  • RAID level: handled automatically. Pure FlashArray uses RAID-HA, designed to protect against three failure modes specific to flash storage: device failure, bit errors, and performance variability.

  • Intelligent data placement: not significant. The Purity Operating Environment was designed from the ground up to take advantage of flash's unique capabilities and is not constrained by the disk paradigm, so "hot" and "cold" disk platter placements are not relevant.

For ease of bucket management, and to enable backups of Warm or Cold buckets, we recommend using separate Pure volumes for Hot/Warm, Cold, and Frozen buckets (if you decide to use Frozen on FlashArray) per indexer.

  • Hot/Warm: 1 FlashArray volume per indexer. Use a separate volume stanza for Hot/Warm buckets in indexes.conf, for example:
    [volume:hot]
    path = /hot/splunk

  • Cold: 1 FlashArray volume per indexer. Use a separate volume stanza for Cold buckets, for example:
    [volume:cold]
    path = /cold/splunk

  • Frozen: 1 FlashArray volume per indexer. Set coldToFrozenDir or coldToFrozenScript under each <index> stanza.

 

Make sure to mount these FlashArray volumes on every relevant indexer at the same mount point (such as "/hot" or "/cold") so the paths in indexes.conf are effective on each indexer.

Because Pure FlashArray volumes are always thin-provisioned, Splunk administrators can provision a large volume up front and avoid adding volumes later to accommodate space growth.

Keep all the FlashArray volumes for all the indexers in a cluster at the same size to avoid imbalanced space usage.

Linux Mount options

You can use either the EXT4 or XFS filesystem on the Splunk indexers to mount the FlashArray volumes. As buckets age and directories are removed, the underlying block storage must be issued TRIM/unmap commands to reclaim the space. To accomplish this, you can use the discard mount option, which issues TRIM commands to FlashArray to release the space occupied by those directories.

Following are the recommended mount options:

discard,noatime

If the discard option is not preferred under your standard operating procedures, make sure to run the fstrim command on the mount point periodically (once a day or once a week) to release the space at the FlashArray level.
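As an illustration, the discard option in an /etc/fstab entry and a cron-based fstrim alternative might look like the following (the device path, mount point, and schedule are examples, not Pure recommendations):

```shell
# /etc/fstab entry for the Hot/Warm volume (device path is illustrative)
/dev/mapper/vg_splunk-lv_hot  /hot  xfs  discard,noatime  0  2

# Alternative to online discard: run fstrim weekly via cron
# (crontab entry; runs Sundays at 02:00)
0 2 * * 0  /usr/sbin/fstrim -v /hot
```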

Logical Volume Manager

We recommend using the Logical Volume Manager (LVM) at the indexer level: add the FlashArray volume to a volume group and carve out a logical volume for the Hot/Warm or Cold tier from it. This enables dynamic storage expansion when the indexer needs more space for a tier hosted on Pure FlashArray.
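A minimal sketch of that LVM layout, assuming XFS and placeholder multipath device names:

```shell
# Put the FlashArray multipath device under LVM and carve out the Hot/Warm LV
pvcreate /dev/mapper/<fa-mpath-device>
vgcreate vg_splunk /dev/mapper/<fa-mpath-device>
lvcreate -n lv_hot -l 100%FREE vg_splunk
mkfs.xfs /dev/vg_splunk/lv_hot

# Growing the tier later: add a new FlashArray volume to the volume group
pvcreate /dev/mapper/<new-fa-device>
vgextend vg_splunk /dev/mapper/<new-fa-device>
lvextend -r -l +100%FREE /dev/vg_splunk/lv_hot   # -r resizes the filesystem too
```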

Linux Best Practices

The Linux recommended settings for FlashArray, including multipathing queue settings, are documented under the Solutions page at the Pure Storage support site.

https://support.purestorage.com/Solu...ended_Settings

Cold tier on Pure FlashBlade

FlashBlade filesystems

  • Always create a separate NFS filesystem for every indexer to host the Cold tier.

  • FlashBlade filesystems are always thin-provisioned, so Splunk administrators can provision a large filesystem up front and avoid resizing it to keep up with space growth.

  • Do not set the hard limit parameter for the filesystem size as this will limit the flexibility of adding more space as needed.

  • Keep all the NFS filesystems for all the indexers in a cluster at the same size to avoid imbalanced space usage.

Linux Mount options

 Use the following mount options to mount the NFS filesystem on the indexer nodes for the Cold tier. 

rw,bg,nointr,hard,tcp,vers=3,rsize=16384
  • Always mount the filesystem with "hard" mount option and do not use "soft" NFS mounts.
  • Do not disable attribute caching.
  • Do not specify the wsize option; the host will pick up the default size offered by FlashBlade (512K).

Note: Reducing rsize from the 512K default to 16K (or even 32K) offers better read performance.
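Put together, an /etc/fstab entry for the Cold tier might look like this (the data VIP and filesystem name are placeholders):

```shell
# /etc/fstab entry for the Cold tier filesystem on FlashBlade
<fb-data-vip>:/splunk-cold-idx01  /cold  nfs  rw,bg,nointr,hard,tcp,vers=3,rsize=16384  0  0
```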

Splunk does not recommend placing the Hot/Warm tier on NFS. Please see the Splunk documentation for more details.

Splunk SmartStore Architecture

Remote warm tier on FlashBlade

The minimum Purity//FB version to run Splunk SmartStore on FlashBlade is 2.3.0, which includes all the object-related functionality required to host Splunk SmartStore index data on FlashBlade over the S3 protocol.

Remote Volume

The volume definition for the remote storage in indexes.conf points to the remote object store where Splunk SmartStore stores the warm data.  The remote volume definition looks like the following.

[volume:remote_store]
storageType = remote
path = s3://<bucket name>
# The following S3 settings are required only if you’re using the access and secret
# keys. They are not needed if you are using AWS IAM roles.
remote.s3.access_key = <access key of the account that holds the bucket>
remote.s3.secret_key = <secret key of the account that holds the bucket>
remote.s3.endpoint = http://<FlashBlade-data-vip>

[splunk_index]
remotePath = volume:remote_store/$_index_name
repFactor = auto
homePath = <home path specification>
  • Each remote volume definition can have only one path, that is, a single S3 bucket name.

  • The remote volume which refers to the S3 bucket on a FlashBlade should be limited to an indexer cluster or a standalone indexer.  The same S3 bucket cannot be shared across two clusters or standalone indexers.

  • An indexer cluster or a standalone indexer can have one or more remote volumes.  

  • A SmartStore index is limited to a single remote volume and cannot be spread across multiple remote volumes.

  • All peer nodes of an indexer cluster should use the same SmartStore configurations.

Splunk-related settings

Bucket Size

Splunk has predefined bucket sizes, configured with the maxDataSize parameter in indexes.conf:

maxDataSize = <positive integer> | auto | auto_high_volume

The default is "auto" (750MB), whereas auto_high_volume is 10GB on 64-bit systems and 1GB on 32-bit systems.

Splunk's general recommendation for high-volume environments is auto_high_volume, but for Splunk SmartStore indexes, the specific recommendation is "auto" (750MB) or lower. This avoids timeouts when downloading large buckets from the remote object store back into the cache.

Recommended setting:

maxDataSize = auto
TSIDX Reduction

SmartStore doesn’t support TSIDX reduction. Do not set the parameter enableTsidxReduction to “true” for SmartStore indexes.

Recommended setting:

enableTsidxReduction = false
Bloom Filters

Bloom filters play a key role with SmartStore in reducing the download of tsidx data from the remote object store to the cache. Do not set the parameter createBloomfilter to “false.”

Recommended setting:

createBloomfilter = true
Versioning

FlashBlade supports object versioning, which SmartStore uses to protect against accidental deletion. Splunk data is normally deleted when it surpasses the configured retention period. Setting this parameter to false on S3 storage that supports versioning, such as FlashBlade, causes Splunk to place a delete marker on objects rather than physically deleting them, making it possible to recover from accidental deletion. With the default setting of true, Splunk SmartStore permanently deletes all versions of the data when it ages out, and the data cannot be recovered.

Recommended setting:

remote.s3.supports_versioning = false 

Versioning must be enabled at the FlashBlade bucket level when the bucket is created, as the default is no versioning. The following screenshot shows how to enable versioning for a bucket through the FlashBlade GUI.

[Screenshot: enabling versioning on a bucket in the FlashBlade GUI]

If the Purity//FB version (below 3.0) does not support enabling versioning online, use the following AWS CLI command to enable bucket versioning.

aws s3api put-bucket-versioning --bucket <bucket-name> --versioning-configuration Status=Enabled
Space Reclamation

Because remote.s3.supports_versioning is set to false, data is not physically removed when it ages out. We therefore recommend setting a lifecycle policy on the FlashBlade S3 bucket to physically remove the deleted data and reclaim the space.

As of Purity//FB 3.0, the lifecycle policy can only be set through Python code and not through the GUI. The option to set a lifecycle policy through the GUI is planned for a future Purity//FB release.

Following is sample Python code that sets the lifecycle policy of a given bucket on FlashBlade. It removes all noncurrent versions of objects (deleted or overwritten objects) after, in this example, 7 days. Update the value of NoncurrentDays as your requirements dictate.

import boto3

# Connect to the FlashBlade S3 endpoint (replace the placeholders with
# your access key, secret key, and data VIP)
s3 = boto3.resource(service_name='s3', use_ssl=False,
                    aws_access_key_id='<access_key>',
                    aws_secret_access_key='<secret_key>',
                    endpoint_url='http://<FB data-vip>')

# Expire noncurrent object versions (deleted or overwritten objects)
# 7 days after they become noncurrent
s3.meta.client.put_bucket_lifecycle_configuration(
    Bucket='<bucket-name>',
    LifecycleConfiguration={
        'Rules': [
            {'ID': 'rule1',
             'Filter': {'Prefix': '/'},
             'Status': 'Enabled',
             'NoncurrentVersionExpiration': {'NoncurrentDays': 7}},
        ]
    }
)
Multi-part upload/download

FlashBlade supports multipart upload and download. The default part size of 128MB is sufficient; we recommend not modifying it unless a new value has been proven to improve throughput.

List Object Version

FlashBlade supports object listing version V2, which is much more performant than V1. To improve performance when Splunk lists objects, V2 is highly recommended.

Recommended setting:

remote.s3.list_objects_version = v2

Cache Manager settings

The cache manager plays a vital role in maximizing search efficiency by managing the local cache intelligently. It favors holding the buckets that have a high chance of participating in future searches, and when the cache fills up, it evicts the buckets least likely to participate in future searches. For more information on how the cache manager works, please see the SmartStore Cache Manager documentation.

CacheManager settings generally have global scope and are configured under the [cachemanager] stanza in server.conf. In an indexer cluster environment, configure the settings on each indexer peer node.

Except for the "recency" settings, CacheManager settings cannot be applied at the index level.

eviction_policy

Splunk recommends not changing the default eviction policy of lru, which evicts the least recently used buckets.

max_cache_size

Specifies the maximum cache size, in megabytes, for the disk partition that hosts the cache. This setting applies per indexer; it is not the maximum cache size across the cluster. When the occupied space of the cache exceeds max_cache_size, or the free space on the partition falls below the sum of minFreeSpace and eviction_padding, the cache manager starts evicting data.
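As a sketch, a [cachemanager] stanza in server.conf for a roughly 500GB cache partition (the value is illustrative, not a sizing recommendation):

```ini
[cachemanager]
# per-indexer cache limit in megabytes (~500GB)
max_cache_size = 512000
# keep the default least-recently-used eviction policy
eviction_policy = lru
```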

hotlist_recency_secs

The Splunk SmartStore eviction policy generally favors evicting the least recently searched buckets: the cache manager keeps buckets that were searched recently and evicts buckets that were least recently searched, even if those buckets were created recently.

If most of your searches are on the recently ingested data, it makes more sense to protect this data from being evicted using the hotlist_recency_secs parameter.  This parameter sets the cache retention period based on the bucket’s age (aka recency) of the warm buckets in the cache and helps to protect the recent buckets over other buckets.  This setting overrides the eviction policy.

The recency or the bucket age is determined by the interval between the bucket’s latest time and the current time.  As the name implies, the setting is in seconds and the default is 86400 seconds or 1 day.  The CacheManager will not evict the buckets until they reach this configured setting unless all other buckets have already been evicted.

Setting can be at an index level or at the global level within the indexes.conf file but the recommendation is to set this parameter at an index level to favor protecting data in critical indexes over non-critical indexes.

For cache eviction to function optimally, set this parameter in line with the max_cache_size setting. Do not set hotlist_recency_secs to a value that would require more cache space than max_cache_size provides, as this can impair cache eviction.

For example, if ingest adds 100GB of new buckets daily, a 500GB cache can hold only five days of recent data, so any hotlist_recency_secs over 5 days would keep cache eviction from working optimally. Alternatively, if your searches always target data ingested within the last 30 days, set hotlist_recency_secs to 2592000 seconds (30 days) and make sure max_cache_size can hold 30 or more days of daily ingest.
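The sizing rule above can be sketched as a quick calculation (the helper function is ours for illustration, not a Splunk setting):

```python
def max_hotlist_recency_secs(max_cache_size_gb: float, daily_ingest_gb: float) -> int:
    """Upper bound for hotlist_recency_secs: the number of days of daily
    ingest the cache can hold, converted to seconds."""
    days_held = max_cache_size_gb / daily_ingest_gb
    return int(days_held * 86400)

# A 500GB cache with 100GB/day of new buckets holds 5 days of recent data,
# so hotlist_recency_secs should not exceed 432000 seconds.
print(max_hotlist_recency_secs(500, 100))  # 432000
```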

Recommended setting:

Please set the hotlist_recency_secs parameter at the index level for critical indexes in indexes.conf to protect the data in the cache from eviction based on the required age and in alignment with the max_cache_size settings.  
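For example, a critical index stanza in indexes.conf might look like this (the index name and volume name are illustrative):

```ini
[critical_index]
remotePath = volume:remote_store/$_index_name
# protect buckets newer than 30 days (2592000 seconds) from cache eviction
hotlist_recency_secs = 2592000
```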

hotlist_bloom_filter_recency_hours

Similar to hotlist_recency_secs, the hotlist_bloom_filter_recency_hours parameter protects the metadata files like bloomfilter from eviction.  The use of bloom filters during searches avoids the need to download larger bucket objects like the rawdata journal file or the time series index files (tsidx) from the remote object storage.

The default setting is 360 hours, or 15 days. With this setting, the cache manager defers eviction of smaller files such as bloom filters until the interval between the bucket's latest time and the current time exceeds this setting. If searches are limited to data ingested within the last n days, set this parameter for all critical indexes to the number of hours corresponding to n days; if searches are limited to the last 30 days, set it to 720.

Recommended setting:

Please set the hotlist_bloom_filter_recency_hours parameter at the index level for critical indexes in indexes.conf to protect the smaller metadata files in the cache from eviction, based on the required age.
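Applied in indexes.conf, a 30-day setting for a critical index might look like this (the index name is illustrative):

```ini
[critical_index]
# keep bloom filters for buckets newer than 30 days (720 hours) in the cache
hotlist_bloom_filter_recency_hours = 720
```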