Install and secure ElasticSearch 1.x on Digital Ocean

This blog post describes how to install and secure ElasticSearch on a Digital Ocean virtual root server.

Setting up the server

First of all you need to create a droplet on Digital Ocean. To do that, you have to register:

https://www.digitalocean.com/

You can pay with your credit card or with PayPal.

To start, you can choose the $10 option, which comes with 1GB RAM and a 30GB SSD drive. As the operating system I chose Ubuntu 14.04, but you can choose whatever you’re comfortable with.

Then you have to choose the region. Choose one that is near you to minimize network latency.
If you choose the option “Private Network”, your droplets in the same region can, once you have several of them, communicate without the traffic going through the public internet.

That will be very convenient if you want to set up a multi-machine cluster and enable unicast later. For now let’s start with one node.

The registration and setup process takes about 5 minutes.

Configuring the server

Once you’re done, the SSH credentials will be sent to your e-mail account.
You log in as root, and on the first login you will be asked to change your password.

Install the JDK

You can install either OpenJDK or the Oracle JDK. Simon Willnauer, one of the founders of ElasticSearch, recommends using the Oracle JDK because its releases tend to be more reliable when it comes to updates. Anyway, I was lazy, and since the Oracle JDK is not in the Ubuntu repositories, I’m going with OpenJDK, which I have mainly used over the past few months.

[code language=”bash”]
$ apt-get update
$ apt-get install openjdk-7-jdk
[/code]

Then you need to set JAVA_HOME. Usually you do that in ~/.bashrc, but on the Digital Ocean droplet I use the system-wide file /etc/bash.bashrc. Following this guide saves you from googling it and saves the energy that one bulb consumes per year 🙂

At the end of the file just enter:

[code language=”bash”]
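# On a 64-bit Ubuntu 14.04 droplet the OpenJDK 7 directory carries an -amd64 suffix; check with: ls /usr/lib/jvm
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64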
[/code]

Then you need to refresh that file:

[code language=”bash”]
$ source /etc/bash.bashrc
[/code]
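
If you are not sure which directory name the JDK ended up in, the path can also be derived from the java binary itself. The following is only a sketch: the function name detect_java_home is made up for this post, and it assumes GNU readlink (as shipped with Ubuntu) and the usual JDK layout where the binary lives under bin/ or jre/bin/.

```bash
# Derive a JAVA_HOME candidate from a java binary by following its symlinks.
# detect_java_home is a hypothetical helper, not part of any package.
detect_java_home() {
    local java_bin real
    # Use the first argument if given, otherwise whatever "java" is on the PATH.
    java_bin="${1:-$(command -v java || true)}"
    [ -n "$java_bin" ] || return 1
    # Follow the symlink chain (e.g. /usr/bin/java -> /etc/alternatives/java -> ...).
    real="$(readlink -f "$java_bin")" || return 1
    real="${real%/bin/java}"   # strip the trailing /bin/java
    real="${real%/jre}"        # strip /jre if the binary sits under jre/bin
    printf '%s\n' "$real"
}

# Usage (prints the directory; compare it with what you put in /etc/bash.bashrc):
#   detect_java_home
```

If the printed directory differs from the value you exported, trust the printed one.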

Installing ElasticSearch

Go to the ElasticSearch homepage and copy the download link for the latest version.

http://www.elasticsearch.org/overview/elkdownloads/

Then go to whatever directory you like, for example /home, and download the zipped file.

[code language=”bash”]
$ wget 'https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.3.4.zip'
[/code]

Then unzip it:

[code language=”bash”]
$ apt-get install unzip
$ unzip elasticsearch-1.3.4.zip
[/code]

Now enter your ElasticSearch home directory.

[code language=”bash”]
$ cd elasticsearch-1.3.4
[/code]

Configuring ElasticSearch

Now you need to edit the config file config/elasticsearch.yml.
Useful settings that you should change from the very beginning are:

Cluster name: just set it to something other than the default. By default, all nodes within the same network will form a cluster if they have the same cluster name and multicast is enabled. We will disable multicast below, but you might need it in the future, and then you should know what else you can do to avoid involuntarily exchanging data with others.
mlockall: do not enable it if you have only 1GB of RAM. I tried it and ElasticSearch was not even able to start. So just leave the default.
Disable multicast discovery: for the reasons I just explained. This is the default feature most people complain about, but the ElasticSearch developers want to keep it as it is because ElasticSearch is supposed to scale out of the box. Anyway, if you’re running it locally and other people in your local network run it too with the same default cluster name, you will very soon wonder about indices you never created on your machine. It’s scary and funny at the same time. And usually that’s when people first learn about multicast and start thinking about the configuration.

The other default settings are fine to start with, provided your data grows slowly and you don’t have to start with 1TB. But in that case you would probably not be reading this blog.
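
As a rough sketch, the settings discussed above could be appended to the config file like this. The key names cluster.name and discovery.zen.ping.multicast.enabled are the ones I know from ElasticSearch 1.x. The cluster name my-droplet-cluster and the function name write_basic_settings are placeholders invented for this example. It assumes those keys are still commented out in your elasticsearch.yml, as they are in the stock file.

```bash
# Append the basic settings from this section to an elasticsearch.yml file.
# write_basic_settings and "my-droplet-cluster" are placeholders for this sketch.
write_basic_settings() {
    local config_file="$1"
    cat >> "$config_file" <<'EOF'
# Use a cluster name that nobody else on the network is likely to use
cluster.name: my-droplet-cluster
# Turn off multicast discovery so the node does not join foreign clusters
discovery.zen.ping.multicast.enabled: false
# bootstrap.mlockall is deliberately left at its default on a 1GB machine
EOF
}

# Usage, from the ElasticSearch home directory:
#   write_basic_settings config/elasticsearch.yml
```

Editing the file by hand works just as well. The function form only makes the change repeatable if you set up more droplets later.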

Next have a look at the startup script: bin/elasticsearch.in.sh which is called when you start ElasticSearch by typing bin/elasticsearch.

In the first few lines you can set the min and max Java heap size.
The default max value is 1GB. Since you only have 1GB of memory available on your machine, you might lower that a bit, depending on how many other applications you plan to run on this server. Just edit that file, set the max heap size to 800m or so, and then close it.
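
If you prefer not to open an editor, the same change can be scripted. This is a sketch under one assumption: that the maximum heap is set in a line of the form ES_MAX_MEM=1g, which is how I know the 1.x startup script. Check your copy of bin/elasticsearch.in.sh first. The function name set_max_heap is made up for this example.

```bash
# Rewrite the ES_MAX_MEM assignment in a startup script, keeping a .bak copy.
# Assumes the file contains a line like "    ES_MAX_MEM=1g" (see lead-in).
set_max_heap() {
    local file="$1" size="$2"
    # Only lines that start with optional whitespace followed by ES_MAX_MEM= are touched.
    sed -i.bak -E "s/^([[:space:]]*ES_MAX_MEM=).*/\1${size}/" "$file"
}

# Usage, from the ElasticSearch home directory:
#   set_max_heap bin/elasticsearch.in.sh 800m
```

The original file is kept next to it with a .bak suffix, so the change is easy to undo.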

Securing ElasticSearch

Now we could start our ElasticSearch server. But what could happen? By default ElasticSearch listens on port 9200 and is open to everyone who knows your IP address.
So people could put some data there, or steal your data, without much effort. Since ElasticSearch is becoming increasingly popular, hackers are trying to take advantage of those easy-to-use, ready-to-go default values and of the fact that ElasticSearch doesn’t have any built-in security. It’s open and easily accessible. So you should take a look at this blog post to explore different options for securing your cluster:

http://brudtkuhl.com/securing-elasticsearch/

I’ll explain one of those options below.

Installing the Http-Basic plugin

The http-basic plugin, which can be used for simple authentication, is hosted on GitHub:

https://github.com/Asquera/elasticsearch-http-basic

To install it, create a directory called plugins in your ElasticSearch home folder, then run the plugin script from that home folder (not from inside plugins, because bin/plugin is a relative path).

[code language=”bash”]
$ mkdir -p plugins
$ bin/plugin --url 'https://github.com/Asquera/elasticsearch-http-basic/releases/download/1.3.2/elasticsearch-http-basic-1.3.2.jar' --install http-basic
[/code]
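
To see whether anything actually landed in the plugin directory, a small check like the following can help. It is only a sketch: plugin_dir_present is a name made up for this post, and all it verifies is that plugins/http-basic exists and is not empty. It does not verify that ElasticSearch will load the plugin.

```bash
# Succeeds when <es_home>/plugins/<name> exists and contains at least one entry.
# plugin_dir_present is a hypothetical helper for this post.
plugin_dir_present() {
    local es_home="$1" name="$2"
    local dir="${es_home}/plugins/${name}"
    [ -d "$dir" ] && [ -n "$(ls -A "$dir")" ]
}

# Usage, from the ElasticSearch home directory:
#   plugin_dir_present . http-basic && echo "http-basic files found"
```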

After that you have to edit your elasticsearch.yml file again.
In the network section just add:

[code language=”bash”]
http.basic.log: true
http.basic.user: "some_user"
http.basic.password: "some_password"
[/code]

And then you’re finally ready to start your cluster.
If you try to access it via http with your browser:

[code language=”bash”]
http://178.62.209.??:9200/_search?q=test&pretty=1
[/code]

Then you should be asked to log in.
That’s it. Your ElasticSearch is secured.
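
If you would rather check from the command line than from the browser, the HTTP status code tells the same story. The sketch below assumes curl is installed. es_http_status is a name made up for this post, and DROPLET_IP stands for your droplet’s address. Run it from a machine other than the droplet, so that you test what outsiders see. A 401 without credentials would correspond to the login prompt in the browser, and a 200 with the user and password from elasticsearch.yml would mean the login works.

```bash
# Print only the HTTP status code for a URL, optionally sending basic-auth credentials.
# es_http_status is a hypothetical helper; curl prints 000 if no HTTP response arrived.
es_http_status() {
    local url="$1" user="${2:-}" pass="${3:-}"
    if [ -n "$user" ]; then
        curl -s -o /dev/null -w '%{http_code}' --max-time 5 -u "${user}:${pass}" "$url"
    else
        curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url"
    fi
}

# Usage (replace DROPLET_IP and the credentials with your own values):
#   es_http_status 'http://DROPLET_IP:9200/_search?q=test'
#   es_http_status 'http://DROPLET_IP:9200/_search?q=test' some_user some_password
```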

Have fun with ElasticSearch!

Saskia Vola

Textmining, NLP and Elasticsearch consulting