SPLK-1002 Splunk Core Certified Power User – Splunk Indexer And Search Head Clustering Part 3

  1. Handson Indexer Clustering : part 03

Once we add these indexers, you should be able to see our indexer cluster working successfully. The procedure is the same for every indexer, but if we have more than ten indexers, how do you manage this configuration? We'll see that when we build our architecture in Amazon AWS, that is, multisite clustering along with high availability. We'll be deploying all this configuration from our deployment server, and you'll see how easy that makes it.

So this is our second indexer. Go to Settings > Indexer clustering > Enable indexer clustering. It is a slave (peer) node. Enter the URL of our master, with management port 8089, and the replication port 9080. This replication port should be the same across all your indexers; it is used to exchange data between cluster members. Let us go ahead and restart this. Once the restart is complete, you should be able to see another peer added to your cluster master, and the data will be completely searchable. As of now, some data is not searchable because we have only one indexer.

Okay, this one took a while to come up. Let us log back in. As you can see, we no longer see any clustering errors on our second indexer, because we have successfully added the required number of peers. This one is a license error. You can ignore it because we are on the free license.

It says the previously configured license manager is not reachable; you can ignore this. Last time, we saw this error as soon as we added indexer one. But now that we have added indexer two, as you can see, this entire screen has changed to reflect that all the configuration has been reported to the master, that is, your cluster master.

Let us go ahead and refresh this. This error has also cleared because we have met the minimum requirement of the replication factor: the number of indexers must be at least equal to your replication factor. That is the minimum criterion. The replication factor cannot be higher than the number of indexers. So as you can see, we now have two peers, and the data is completely searchable.

The search factor and replication factor are not met yet because the data has not yet been copied between the indexers. As you can see, one indexer has this many buckets, and the other indexer has yet to receive its copies. Over a period of time, this data will be copied across the indexers, and you'll see the same check marks against each index. So this is how we configure single-site indexer clustering. At any given point in time, if one of these indexers goes down, the other indexer will be able to give you 100% of the results.

  1. Handson Indexer Clustering : part 04

In our previous video we configured this indexer cluster using the web console. Now let us go to our Linux CLI, where the configuration files were automatically generated as part of indexer clustering, and identify where these configurations are and what their syntax is. So this is our indexer one and indexer two. Let us first see the configuration generated on our cluster master. To do that, go to /opt/splunk, that is your $SPLUNK_HOME; on Windows it will be C:\Program Files\Splunk. After that, go to etc/system/local; the configuration should be in the server.conf file.

As you can see, the latest changes are here. Let me turn on line numbers. As you can see, lines 35 to 38 are the configuration that was autogenerated when we enabled the clustering master node. It has generated four lines under the clustering stanza. It has the name that we provided in Splunk Web, single site cluster, and the mode in which it is operating is the cluster master. It has a replication factor of two.

By default the search factor is two, so the search factor setting will be taken from the server.conf in our default directory. If you are not aware of how this configuration precedence works, I would highly recommend you go back to our third or fourth module, where we discussed it fully and went through a use case for understanding configuration file precedence in Splunk. So that is our Splunk cluster master, which has generated four lines of configuration.
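The four generated lines described above might look roughly like the following. This is only a sketch reconstructed from the description: the exact label text is an assumption, and your file may contain additional settings such as a security key.

```ini
# $SPLUNK_HOME/etc/system/local/server.conf on the cluster master (sketch)
[clustering]
# The label is whatever name was entered in Splunk Web; this value is assumed.
cluster_label = single_site_cluster
mode = master
replication_factor = 2
# search_factor is not written here, so the default value of 2
# from etc/system/default/server.conf applies.
```

Note that Splunk .conf files expect comments on their own lines, which is why none of the comments above share a line with a setting.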

Now let us go to one of the clients, that is, the cluster slaves, and see the configuration generated from the changes we made in Splunk Web. It will be in the same location as on the cluster master, that is, etc/system/local/server.conf. As you can see here, we also have four lines of configuration, from line 26 to line 30. First, there is a configuration adding a replication port, so that Splunk listens on this port for copies of data sent by other indexer cluster members and can store a copy of that data here.

Similarly, we have the clustering stanza, which points to the master node and its management port and specifies the mode as slave, so this is an indexer that holds some of the copies in our cluster. The same configuration will be generated on our other indexer. As you can see, we have a similar or identical configuration generated on our indexer one.
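As a rough sketch, the peer-side configuration described above could look like this. The host name is a placeholder and not a value shown in the video; the ports are the ones used earlier in this walkthrough.

```ini
# $SPLUNK_HOME/etc/system/local/server.conf on each indexer (sketch)
# The peer listens on this port for bucket copies from other peers.
[replication_port://9080]

[clustering]
# <cluster-master-host> is a hypothetical placeholder for your master's address.
master_uri = https://<cluster-master-host>:8089
mode = slave
```

If the cluster was set up with a security key, a pass4SymmKey line would also appear in the clustering stanza on the master and on every peer.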

So this confirms we have successfully configured our indexer cluster. But we have not yet specified which indexes should be replicated. As of now we have two indexes, _audit and _internal. Let us make sure these two indexes are replicated. To do that, there is a setting known as repFactor that is set for each index. If you go to one of the cluster peers, you'll be able to see etc/system/default/indexes.conf. There you will find repFactor. As you can see, by default repFactor is zero.

  1. Handson Indexer Clustering : part 05

As you can see, none of the indexes are replicated by default. So we'll go ahead and change this setting to auto. To do that, copy the syntax from system/default and edit a file under system/local. We'll also see this configuration as part of our deployment, that is, high availability multisite clustering on Amazon AWS. So I'll make all the indexes replicate across the cluster.

I'll change it to auto. I'll make the same change on the other Splunk instance. Let me restart before I move to the other instance. We need to make the same change on all the cluster members so that automatic replication takes place, so that once data comes into an indexer, it is replicated to the other indexers according to the replication factor. The file is etc/system/local/indexes.conf. Once the other instance is up... I think we have made a mistake here: the setting should not be under a global stanza. It should be left as it is.

Yes, so it should not be under a global stanza; it is just the default configuration without any stanza, and it applies to all the indexes. Let us wait for this instance to come up. Once it is successfully up, we can restart the other instance of our cluster. Managing clusters is easier if you have a deployment server, because you can push all this configuration in a matter of minutes.
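The corrected file from this step can be sketched as follows, assuming you want every index on the peer replicated. The setting sits at the top of the file, outside any stanza, so it acts as the default for all indexes.

```ini
# $SPLUNK_HOME/etc/system/local/indexes.conf on every cluster peer (sketch)
# No stanza header above this line: it applies to all indexes.
# The shipped default is repFactor = 0, which means no replication.
repFactor = auto
```

To replicate only selected indexes instead, the same line could go under an individual index stanza such as [_audit].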

We will see how this effort reduces drastically with a deployment server when we build our own enterprise-level Splunk architecture on Amazon servers. By the way, you can see the live status of your Splunk instances on the indexer clustering page. As you can see, the status of one of the instances is shutting down. Once it is up, you'll see the data start replicating. So it is up. As you can see, we now have a third index, that is _introspection. All these are internal indexes. We have not created any custom indexes.

We had created one or two, but those should be deleted by now. By default, whenever you enable a cluster, the old data will not be replicated; only new incoming data will be replicated. It is highly recommended to let the old data retire instead of copying it into the clustered environment, because copying non-clustered data into a clustered environment is a time-consuming, resource-consuming and highly complex activity. It requires you to rename all the directories and append the GUID to them, and changing all those names makes it a mammoth task to bring old non-clustered data into a clustered environment.

It is always the duty of the Splunk admin or architect to convey this message during any migration from a non-clustered environment to a clustered environment: it is highly recommended not to move large volumes of data, because doing so consumes a lot of resources and a lot of time.

  1. Handson Multisite Indexer Clustering : Part 01

For configuring a multisite indexer cluster, as of now there is no proper Splunk Web support. In a future version we should be getting support for configuring multisite. Let me log in to one of our Splunk servers. Here, under the indexer clustering tab, you don't have an option to specify which site the instance belongs to, and the main configuration in multisite clustering is choosing the site value for your indexers, that is, stating which site each one belongs to.

As you can see, it only offers the master node, cluster label and password, along with the replication and search factor. It doesn't have any site information related to clustering. So we'll be configuring the multisite indexer cluster through the back end. Let me log out. I have logged in to the Linux CLI. So this is our cluster master, which we just saw in Splunk Web, and this is our indexer number one and indexer number two. I'll place this one in site one and this one in site two. The configuration for this is relatively simple. As with single-site clustering, it goes in the server.conf file under system/local, the same file that was generated for our single-site cluster.
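Placing a peer in a site, as described above, comes down to one extra setting in that peer's server.conf. Here is a minimal sketch, assuming the peer keeps the replication port and slave mode from the single-site setup; the master host name is a placeholder, not a value from the video.

```ini
# $SPLUNK_HOME/etc/system/local/server.conf on indexer one (sketch)
[general]
# Use site2 here on the second indexer.
site = site1

[replication_port://9080]

[clustering]
# <cluster-master-host> is a hypothetical placeholder.
master_uri = https://<cluster-master-host>:8089
mode = slave
```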

As you can see, the previous cluster that we created is in disabled mode, because I disabled it in order to demonstrate multisite clustering. As I mentioned in our previous discussion, in multisite clustering the site replication factor and the site search factor override the replication factor and search factor defined for single-site clustering. So this is our cluster master. Here we will deploy the cluster configuration, which includes the site replication factor and the site search factor. This will be our clustering configuration.

That is a total of six lines. We'll go through them one by one. So here I've added a couple of lines for our configuration, starting from the general stanza. Let me make it clearer. I'll be attaching these multisite configuration files as part of your course material. You can download them and review them to better understand how this works. You can copy and paste the same configuration into your environment, changing the search factor and replication factor to suit your organization's requirements. It should work fine unless a newer version of Splunk introduces a change, in which case it might fail.

  1. Handson Multisite Indexer Clustering : Part 02

The first configuration in multisite clustering is defining which site this instance belongs to. I'll say my cluster master belongs to site one. One more important thing: you can't use arbitrary alphanumeric values as site names. Let's say I want to name my site "main site" or "DR site" or "data center". You can't name sites like this. You can name them site1, site2, site3 and so on, but not any other values.

So I'll define my cluster master's site as site1, and the mode, since it is a cluster master, will be master. The next setting is the one that specifies it's a multisite cluster. Just set multisite = true so that your cluster master is aware that this is a multisite cluster. As you can see, in our previous single-site cluster configuration there was no multisite parameter.

Next we define which sites are available in this multisite cluster. For this demonstration we'll be using site1, that is indexer one, and site2, that is indexer two. The next important configuration is the site replication factor.

As I already said, the number of indexers in the cluster must be at least equal to the replication factor. Here we have two indexers, so I'll make sure there is one copy in each site. Origin is nothing but the site where the data is generated. So origin is one, total is two. You can also name the sites explicitly here, for example site one and site two, so that site one holds one copy and site two holds one more copy.

That is the simplest way to define it, but using origin and total is the best practice: you state how many copies should stay at the origin site, and the ratio specified here is applied. Wherever the data originates, keep one copy there, and copy the remaining copies out of the total to the other site. I'll make the same configuration for our search factor. Always remember, while configuring multisite, the settings must carry the site prefix: site replication factor and site search factor.

If you define only replication factor and search factor, those will not take effect, because the site replication factor and site search factor override them. Save this and restart your cluster master. Once it has restarted, you should be able to see the indexer clustering screen on our cluster master. Let us log in to our cluster master. Once you're logged in, click Settings > Indexer clustering.
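Putting the settings from this walkthrough together, the cluster master's multisite configuration might look like the sketch below. The values follow what was described for this two-site demo; adjust the sites and factors for your own environment.

```ini
# $SPLUNK_HOME/etc/system/local/server.conf on the cluster master (sketch)
[general]
# Site names must take the form site<number>.
site = site1

[clustering]
mode = master
multisite = true
available_sites = site1,site2
# Keep one copy at the site where the data originated, two copies in total.
site_replication_factor = origin:1,total:2
site_search_factor = origin:1,total:2
```

With only two sites and a total of two, the remaining copy necessarily lands on the non-origin site, which gives one copy per site as intended.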

 
