ElasticSearch can be set up on a physical or virtual server, provided it has enough RAM, CPU and disk space for your data.
Terms to Understand:
- Each instance installed on a physical or virtual server is known as a node.
- A group of nodes forms a cluster.
- On node startup, the node searches for other cluster members and checks its index and shard status.
- A node is configured through elasticsearch.yml.
- Data is replicated across nodes as primary and replica shards, while one node acts as the master and coordinates the cluster.
- If master node dies, a master-eligible node is elected to be the new master node.
Index (plural Indices)
- An index is the main data container in ElasticSearch.
- In an index, data is grouped into types, and each type is described by a mapping.
- A mapping describes how records are composed, that is, which fields they contain.
- Every record must be stored as a JSON object.
- To manage a huge volume of records, ElasticSearch splits an index into multiple shards.
- Every record is stored in a shard, which is chosen by an algorithm based on the record ID.
- As a result, updating a record hits only the shard that contains it.
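The ID-based shard selection described above can be sketched as follows. This is a simplified illustration, not ElasticSearch's real routing code: `zlib.crc32` stands in for the internal hash function, and the record IDs and shard count are hypothetical.

```python
import zlib


def pick_shard(record_id: str, number_of_shards: int) -> int:
    """Map a record ID to a shard number using a stable hash (illustrative only)."""
    # zlib.crc32 is an assumption standing in for ElasticSearch's own hash function.
    return zlib.crc32(record_id.encode("utf-8")) % number_of_shards


# Hypothetical record IDs spread over a hypothetical index with 5 shards.
ids = ["book-1", "book-2", "book-3", "book-4"]
placement = {record_id: pick_shard(record_id, 5) for record_id in ids}
print(placement)
```

Because the same ID always hashes to the same shard, a read or update for that ID only needs to contact one shard.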
The following scheme compares the ElasticSearch structure with SQL and MongoDB:
|ElasticSearch|SQL|MongoDB|
|Object (JSON Object)|Record (Tuples)|Record (BSON Object)|
Download the ElasticSearch package for your operating system from the following URL: https://www.elastic.co/downloads/elasticsearch. Once the installation completes, ElasticSearch is ready to use. This article uses an Ubuntu server.
ElasticSearch supports plugins that make this work easier. We will use Marvel to test the REST APIs. The following command installs Marvel in ElasticSearch:
sudo /usr/share/elasticsearch/bin/plugin -i elasticsearch/marvel/latest
After a successful installation, Marvel is available at this URL (a default installation serves plain HTTP): http://localhost:9200/_plugin/marvel
The default page looks like this.
Create index with mappings:
A mapping tells the search engine how to interpret the data, including which fields are searchable and which are tokenized.
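As a rough sketch, a request body that creates an index with a mapping might look like the one below. It assumes the 1.x mapping syntax (the version shown later in this article is 1.6.0) and reuses the library index and books type from the examples that follow. The field names `title`, `isbn` and `cover_url` are hypothetical.

```python
import json

# Body for "PUT /library" (ElasticSearch 1.x syntax; field names are hypothetical).
mapping_body = {
    "mappings": {
        "books": {
            "properties": {
                # Analyzed by default: the text is tokenized for full-text search.
                "title": {"type": "string"},
                # Searchable, but only as one exact, untokenized value.
                "isbn": {"type": "string", "index": "not_analyzed"},
                # Kept in the document source but not searchable at all.
                "cover_url": {"type": "string", "index": "no"},
            }
        }
    }
}

print("PUT /library")
print(json.dumps(mapping_body, indent=2))
```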
Create new document:
Create a new document with a POST request. The URL must specify the index and type, here library/books. On success, the response contains created = true together with the new document's ID, and the version is set to 1.
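A minimal sketch of such a create request, using only Python's standard library. The base URL assumes a local node on the default port, and the document content is hypothetical. The request is built but not sent, so the snippet runs without a server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local node on the default port


def build_create_request(index, doc_type, document):
    """Build (but do not send) a POST request that creates a document with an auto-generated ID."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{index}/{doc_type}",
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical document for the library/books example.
request = build_create_request("library", "books", {"title": "Learning ElasticSearch"})
print(request.get_method(), request.full_url)
# To send it against a running node: urllib.request.urlopen(request)
```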
To fetch a document, send a simple GET request with the index, type and document ID. It returns the data stored under that ID.
Updating a document increases its version counter. The ElasticSearch API returns "created = true" only when new data is inserted. An update returns "created = false" but increments the version number.
To delete a document, send a DELETE request with the index, type and document ID.
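The get, update and delete calls all address the same index/type/ID URL and differ only in HTTP method and body. The sketch below builds the three requests without sending them. The document ID `1`, the document content and the base URL are assumptions for illustration.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local node on the default port


def document_request(method, index, doc_type, doc_id, document=None):
    """Build (but do not send) a request addressed to a single document."""
    data = json.dumps(document).encode("utf-8") if document is not None else None
    return urllib.request.Request(
        url=f"{BASE_URL}/{index}/{doc_type}/{doc_id}",
        data=data,
        headers={"Content-Type": "application/json"},
        method=method,
    )


get_request = document_request("GET", "library", "books", "1")
# PUT with a full document replaces the stored one and bumps its version.
update_request = document_request("PUT", "library", "books", "1", {"title": "ElasticSearch in Practice"})
delete_request = document_request("DELETE", "library", "books", "1")

for req in (get_request, update_request, delete_request):
    print(req.get_method(), req.full_url)
```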
Search by all fields
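One way to search across all fields is the URI search with the `q` parameter. The sketch below only builds the URL. The search text and base URL are assumptions.

```python
from urllib.parse import urlencode


def search_all_fields_url(index, doc_type, text, base_url="http://localhost:9200"):
    """Build a URI-search URL that matches the text against all fields."""
    return f"{base_url}/{index}/{doc_type}/_search?{urlencode({'q': text})}"


url = search_all_fields_url("library", "books", "learning elasticsearch")
print("GET", url)
```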
Search by specific field
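To restrict the search to one field, a `match` query in the request body is a common choice. In this sketch the field name `title`, the search text and the base URL are hypothetical, and the request is built but not sent.

```python
import json
import urllib.request


def build_field_search_request(index, doc_type, field, text, base_url="http://localhost:9200"):
    """Build (but do not send) a search request limited to a single field."""
    body = {"query": {"match": {field: text}}}
    return urllib.request.Request(
        url=f"{base_url}/{index}/{doc_type}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


search_request = build_field_search_request("library", "books", "title", "elasticsearch")
print(search_request.get_method(), search_request.full_url)
print(search_request.data.decode("utf-8"))
```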
Delete index:
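Deleting an index is a DELETE request on the index name alone, and it removes every document in that index. A minimal sketch, assuming the library index and a local node on the default port:

```python
import urllib.request


def build_delete_index_request(index, base_url="http://localhost:9200"):
    """Build (but do not send) a request that deletes a whole index."""
    return urllib.request.Request(url=f"{base_url}/{index}", method="DELETE")


delete_index_request = build_delete_index_request("library")
print(delete_index_request.get_method(), delete_index_request.full_url)
```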
How to integrate with PHP?
Because ElasticSearch exposes a REST API, you can use it through curl from your favorite programming language. Here is an example of a simple curl request to ElasticSearch.
PHP Sample Script:
You can find the PHP client API on GitHub: https://github.com/elastic/elasticsearch-php
<?php
// Load Composer's autoloader so the client classes are available.
require './vendor/autoload.php';

// Create a client with default settings.
$client = new Elasticsearch\Client();

// Ask the node for basic cluster and version information.
$info = $client->info();
print_r($info);

// Output:
// Array
// (
//     [status] => 200
//     [name] => Steve Rogers
//     [cluster_name] => elasticsearch
//     [version] => Array
//         (
//             [number] => 1.6.0
//             [build_hash] => cdd3ac4dde4f69524ec0a14de3828cb95bbb86d0
//             [build_timestamp] => 2015-06-09T13:36:34Z
//             [build_snapshot] =>
//             [lucene_version] => 4.10.4
//         )
//     [tagline] => You Know, for Search
// )
Autosuggest functionality, ElasticSearch vs. MySQL.
We used an autosuggest feature to compare the performance of both systems, each holding 30 lakh (3 million) records. To improve ElasticSearch performance we used a filter and an analyzer, and to improve MySQL performance we used full-text search with an optimized query. The result was quite amazing: ElasticSearch returned suggestions in about half the time MySQL took.