Elasticsearch in Java Spring Boot: Starter Pack

Written by brilianfird | Published 2021/04/14


Both Java and Elasticsearch are popular elements within common technology stacks that companies use. Java is a programming language that was released back in 1996. Java is owned by Oracle and still in active development.
Elasticsearch is a young technology compared to Java — it was only released in 2010 (making it 14 years younger than Java). It’s gaining popularity quickly and is now used in many companies as a search engine.
Seeing how popular both are, many people and companies want to connect Java with Elasticsearch in order to develop their own search engine. In this article, I want to teach you how to connect Java Spring Boot 2 with Elasticsearch. We’ll learn how to create an API that’ll call Elasticsearch to produce results.

Connecting Java With Elasticsearch

The first thing we must do is connect our Spring Boot project with Elasticsearch. The easiest way to do this is to use the client library provided by Elasticsearch, which we can just add to our package manager (like Maven or Gradle).
For this article, we’ll use the spring-data-elasticsearch library provided by Spring Data, which also includes Elasticsearch’s High Level Client library.
Starting our project
Let’s start by creating our Spring Boot project with Spring Initializr. Since we’re going to use the high-level client, I’ll add Spring Data Elasticsearch, a convenient library provided by Spring, to the project’s dependencies.
Adding the dependency to Spring Data Elasticsearch
If you followed my Spring Initializr configuration in the previous section, then you should already have the Elasticsearch client dependency in your project. But if you don’t, you can add it:
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
Creating the Elasticsearch client’s bean
There are two methods to initialize the bean — you can either use the beans defined in the Spring Data Elasticsearch library, or you can create your own bean.
The easier option is to use the bean configured by Spring Data Elasticsearch.
For example, you can add these properties into your application.properties:
spring.elasticsearch.rest.uris=localhost:9200
spring.elasticsearch.rest.connection-timeout=1s
spring.elasticsearch.rest.read-timeout=1m
spring.elasticsearch.rest.password=
spring.elasticsearch.rest.username=
The second method involves creating your own bean. You can configure the settings by creating the RestHighLevelClient bean. If the bean exists, Spring Data will use it as its configuration.
Testing the connection from our Spring Boot application to Elasticsearch
Your Spring Boot app and Elasticsearch should be connected now that you’ve configured the bean. Since we’re going to test the connection, make sure your Elasticsearch is up and running!
To test it, we can create a bean that’ll create an index in Elasticsearch in the DemoApplication.java. The class would look like:
OK, in that code we called Elasticsearch twice with the RestHighLevelClient, which we’ll learn about later on in this article. The first call deletes the index if it already exists. We wrapped that call in a try/catch because, if the index doesn’t exist, Elasticsearch throws an error that would fail our app’s startup.
The second call is to create an index. Since I’m only running a single-node Elasticsearch, I configured the shards to be 1 and replicas to be 0.
If everything went fine, you should see the new index when you check your Elasticsearch. To check it, just go to http://localhost:9200/_cat/indices?v, which lists the indices in your Elasticsearch.
Congrats! You just connected your application to Elasticsearch!
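If you’d rather run the same check from Java, here’s a minimal sketch that uses only the JDK’s built-in java.net.http client (Java 11+). The class and method names are my own for illustration and aren’t part of the project. The parsing also assumes every row has a value in each column before index, which may not hold for closed indices.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the article's project.
final class CatIndicesCheck {

    // Builds the same GET request a browser sends to /_cat/indices?v.
    static HttpRequest catIndicesRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_cat/indices?v"))
                .GET()
                .build();
    }

    // Reads the "index" column out of the plain-text table that ?v returns.
    static List<String> indexNames(String catOutput) {
        List<String> names = new ArrayList<>();
        String[] lines = catOutput.strip().split("\\R");
        if (lines.length == 0 || lines[0].isBlank()) {
            return names;
        }
        // The first line is the header row; find where "index" sits in it.
        List<String> header = Arrays.asList(lines[0].trim().split("\\s+"));
        int column = header.indexOf("index");
        if (column < 0) {
            return names;
        }
        for (int i = 1; i < lines.length; i++) {
            String[] cells = lines[i].trim().split("\\s+");
            if (cells.length > column) {
                names.add(cells[column]);
            }
        }
        return names;
    }

    // Sends the request; this needs a running Elasticsearch node.
    static List<String> fetchIndexNames(String baseUrl) throws Exception {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(catIndicesRequest(baseUrl), HttpResponse.BodyHandlers.ofString());
        return indexNames(response.body());
    }
}
```

With Elasticsearch running locally, CatIndicesCheck.fetchIndexNames("http://localhost:9200") should include the index created at startup.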

Other ways to connect

I recommend you use the spring-data-elasticsearch library if you want to connect to Elasticsearch with Java. But if you can’t use that library, there are other ways to connect your apps to Elasticsearch.
High Level Client
As we know from the previous section, the spring-data-elasticsearch library we used also includes Elasticsearch’s High Level Client. If you’ve already imported spring-data-elasticsearch, then you can already use Elasticsearch’s high-level client.
If you want to, it’s also possible to use the High Level Client library directly without Spring Data’s dependency. You just need to add this dependency in your dependency manager:
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.12.0</version>
</dependency>
We’ll also use this client in our examples because the functionality in the High Level Client is more complete than that of spring-data-elasticsearch.
For more information, you can read the Elasticsearch documentation.
Low Level Client
You’ll have a harder time with this library, but you can customize it more. To use it, you can add the following dependency:
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-client</artifactId>
    <version>8.0.0</version>
</dependency>
For more information, you can read Elasticsearch’s documentation about this.
Transport client
Elasticsearch also provides a transport client, which makes your application identify as one of the nodes of Elasticsearch. I don’t recommend this method because the transport client was deprecated in Elasticsearch 7.0 and removed in 8.0.
If you’re interested, you can read about the transport client here.
REST call
The last way to connect to Elasticsearch is by doing a REST call. Since Elasticsearch exposes a REST API to its clients, you can call that API directly to connect your apps to Elasticsearch. You can use OkHttp, Feign, or another HTTP client to do this.
I also don’t recommend this method because it’s a hassle. Since Elasticsearch already provides client libraries, it’s better to use them instead. Only use this method if you don’t have any other way to connect.
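To give you an idea of what that hassle looks like, here’s a rough sketch of a search over REST. It uses the JDK’s java.net.http client (Java 11+) rather than OkHttp or Feign. The class name, the product index, and the name field are assumptions for illustration; the body is a plain match query sent to the _search endpoint.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper for illustration; not part of the article's project.
final class RestSearchSketch {

    // Escapes the characters that would break a JSON string literal.
    static String escapeJson(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '"': out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    // Builds the JSON body of a match query on a single field.
    static String matchQueryBody(String field, String text) {
        return "{\"query\":{\"match\":{\"" + escapeJson(field) + "\":\""
                + escapeJson(text) + "\"}}}";
    }

    // Builds a POST request to <baseUrl>/<index>/_search with a JSON body.
    static HttpRequest searchRequest(String baseUrl, String index, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    // Sends the search and returns the raw JSON response as a string.
    // You still have to parse the hits yourself, which is part of the hassle.
    static String search(String baseUrl, String index, String field, String text)
            throws Exception {
        HttpRequest request = searchRequest(baseUrl, index, matchQueryBody(field, text));
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

Notice how much is left to you here: escaping, building the URL, and parsing the response. The client libraries handle all of that for you.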

Using Spring Data Elasticsearch

First, let’s learn how to use spring-data-elasticsearch in our Spring project. It’s a high-level library for accessing Elasticsearch, and it’s very easy to use.
Creating an entity and configuring our index
After you’re done connecting your apps with Elasticsearch, it’s time to create an entity! With Spring Data, we can add metadata to our entity, which will be read by the repository bean we created. This way, the code will be much cleaner and faster to develop since we won’t need to create any mapping logic in our service level.
Let’s create an entity called Product:
So let me explain what’s going on in the code block above. First, I won’t go into @Data, @AllArgsConstructor, @NoArgsConstructor, and @Builder. They’re annotations from the Lombok library that generate constructors, getters, setters, a builder, and other things.
Now, let’s talk about the first Spring Data annotation in the entity, @Document. The @Document annotation shows that the class is an entity containing metadata of the Elasticsearch index’s setup. To use the Spring Data repository, which we’ll learn about later on, the @Document annotation is mandatory.
The only mandatory parameter of @Document is indexName. It should be pretty clear from the name: we fill it with the index name we want to use for the entity. In this article, we’ll use the same name as the entity, product.
The second parameter of @Document to talk about is createIndex. If you set createIndex to true, your app will create the index automatically at startup if the index doesn’t exist yet.
The shards, replicas, and refreshInterval parameters determine the index settings when the index is created. If you change the value of those parameters after the index is already created, the settings won’t be applied. So the parameters will only be used when creating the index for the first time.
If you want to use a custom ID in Elasticsearch, you can use the @Id annotation. With @Id, Spring Data tells Elasticsearch to store the ID both as the document’s ID and in the document source.
The @Field type will determine the field mapping of the field. Like shards, replicas, and refreshInterval, the @Field type will only affect Elasticsearch when first creating the index. If you add a new field or change types when the index is already created, it won’t do anything.
Now that we configured the entity, let’s try out the automatic index creation by Spring Data! When we configure the createIndex as true, Spring Data will check whether the index exists in Elasticsearch. If it doesn’t exist, Spring Data will create the index with the configuration we created in the entity.
Let’s start our app. After it’s running, let’s check the settings and see if they’re correct:
curl --request GET \
  --url http://localhost:9200/product/_settings
The result is:
The mappings are also as we expected. It’s the same as we configured in the entity class.

Basic CRUD with the Spring Data repository interface

After we’ve created the entity, we have everything we need to create a repository interface in Spring Boot. Let’s create a repository called ProductRepository.
When you’re creating the interface, make sure to extend ElasticsearchRepository<T, U>. Here, T is your entity type, and U is the type you want to use for the ID. In our case, we’ll use the Product entity we created earlier as T and String as U.
Now that your repository interface is done, you don’t need to write the implementation because Spring takes care of it. You can call every method declared in the interfaces that your repository extends.
For examples of CRUD, you can check the code below:
In the code blocks above, we created a service class called SpringDataProductServiceImpl, which is autowired to the ProductRepository we created before.
There are four basic CRUD functions in it. The first one is createProduct, which, as its name implies, will create a new product in the product index. The second one, getProduct, gets the product we’ve indexed by its ID. The deleteProduct function can be used to delete the product in the index by ID. The insertBulk function will allow you to insert multiple products into Elasticsearch.
Everything’s done! I won’t write about the API testing in this article because I want to focus on how our apps can interact with Elasticsearch. But if you want to try the API, I left a GitHub link at the end of the article so you can clone and try this project.
Custom query methods in Spring Data
In the previous section, we only took advantage of using the basic methods that are already defined in the other classes. But we can also create custom query methods to use.
What’s very convenient about Spring Data is you can make a method in the repository interface, and you don’t need to code any implementation. The Spring Data library will read the repository and automatically create the implementations for it.
Let’s try searching for products by the name field:
Yes, that’s all you need to do to create a function in the Spring Data repository interface.
You can also define a custom query with the @Query annotation and insert a JSON query in the parameters.
Both of the methods we’ve created do the same thing — use the match query with name as its parameter. If you try it, you’ll get the same results.

Using ElasticsearchRestTemplate

If you want to do a more advanced query, like aggregations, highlighting, or suggestions, you can use the ElasticsearchRestTemplate provided by the Spring Data library. By using it, you can create your own query, making it as complex as you want.
For example, let’s create a function for doing a match query to the name field like before:
You should notice the code above is more complex than the one we defined in the ElasticsearchRepository. It’s recommended to use the Spring Data repository if you can. But for a more advanced query like aggregation, highlighting, or suggestions, you must use the ElasticsearchRestTemplate.
For example, let’s write a bit of code that’ll aggregate a term:

Elasticsearch RestHighLevelClient

If you’re not using Spring or your Spring version doesn’t support spring-data-elasticsearch, you can use a Java library developed by Elasticsearch, RestHighLevelClient.
RestHighLevelClient is a library you can use to do basic things like CRUD or managing your Elasticsearch. Even though the name implies that it’s high level, it’s actually more low level when compared to spring-data-elasticsearch.
The advantage of this library over Spring Data is you can also manage your Elasticsearch with it. It provides index and Elasticsearch configuration, which you can use with more flexibility when compared to Spring Data. It also has more complete functionality to interact with Elasticsearch.
The disadvantage of this library compared to Spring Data is that it’s more low level, which means you must write more code.
CRUD with RestHighLevelClient
Let’s see how we can create a simple function with the library so we can compare it to the previous methods we’ve used:
As you can see, it’s now more complicated and harder to implement. Now, you need to handle the exception and also convert the JSON result to your entity. It’s recommended to use Spring Data instead for basic CRUD operations because RestHighLevelClient is more complicated.
I’ve included other CRUD functions in the GitHub project. If you’re interested, you can check it out. The link is at the end of this article.

Index creation

This section is where the RestHighLevelClient holds a clear advantage compared to Spring Data Elasticsearch. When we were creating an index, with its mappings and settings, in the previous section, we only used annotations. It’s very easy to do, but you can’t do much with it.
With RestHighLevelClient, you can create methods for index management or basically almost anything that the Elasticsearch REST API allows.
For example, let’s write some code that’ll create the product index with the settings and mappings we used before:
So let’s see what we did in the code:
  1. We initialized the createIndexRequest while also specifying the index name.
  2. We added the settings to the request by calling createIndexRequest.settings. In the settings, we also configured the field index.requests.cache.enable, which isn’t possible with the Spring Data library.
  3. We made a Map containing the properties and mappings of the fields in the index.
  4. We called Elasticsearch with restHighLevelClient.indices().create.
As you can see, with the RestHighLevelClient, we can create a more customized call for creating an index for Elasticsearch when compared to the annotations in the Spring Data entity. There’s also more functionality in the RestHighLevelClient that doesn’t exist in the Spring Data library. You can read Elasticsearch’s documentation for more information about the library.
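As a dependency-free illustration of steps 2 and 3, here’s a sketch that builds the settings and mappings as plain Java maps and renders them as the JSON body of a create-index request. It doesn’t call RestHighLevelClient at all. The class name, the category field, and the field types are hypothetical; only the name field, the single shard with zero replicas, and the index.requests.cache.enable setting come from this article.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of the article's project.
final class ProductIndexDefinition {

    // Index settings: one shard, no replicas, request cache enabled.
    static Map<String, Object> settings() {
        Map<String, Object> settings = new LinkedHashMap<>();
        settings.put("index.number_of_shards", 1);
        settings.put("index.number_of_replicas", 0);
        settings.put("index.requests.cache.enable", true);
        return settings;
    }

    // Field mappings; the fields and types here are assumed, not the article's.
    static Map<String, Object> mappings() {
        Map<String, Object> properties = new LinkedHashMap<>();
        properties.put("name", Map.of("type", "text"));
        properties.put("category", Map.of("type", "keyword"));

        Map<String, Object> mappings = new LinkedHashMap<>();
        mappings.put("properties", properties);
        return mappings;
    }

    // Tiny JSON renderer for nested maps, strings, numbers, and booleans.
    // It does no escaping, so it's only suitable for trusted constants like these.
    static String toJson(Object value) {
        if (value instanceof Map) {
            return ((Map<?, ?>) value).entrySet().stream()
                    .map(e -> "\"" + e.getKey() + "\":" + toJson(e.getValue()))
                    .collect(Collectors.joining(",", "{", "}"));
        }
        if (value instanceof String) {
            return "\"" + value + "\"";
        }
        return String.valueOf(value);
    }

    // Combines settings and mappings into one create-index request body.
    static String createIndexBody() {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("settings", settings());
        body.put("mappings", mappings());
        return toJson(body);
    }
}
```

The string returned by ProductIndexDefinition.createIndexBody() has the shape Elasticsearch expects as the body of PUT /product, and the two maps mirror what the steps above hand to the create-index request.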

Conclusion

In this article, we’ve learned two ways to connect to Elasticsearch: using Spring Data and through the Elasticsearch client. Both are powerful libraries, but you should use Spring Data whenever it covers your use case, because the code is more readable and easier to work with.
If you want a more powerful library that can basically do anything Elasticsearch allows, though, then you can also use the Elasticsearch High Level Client. You can also use the Low Level Client, which we didn’t cover in this article, if you need even more powerful features.
Thank you for reading this article. I hope it helps you get started with Elasticsearch in Java Spring Boot. If you want to learn more about the libraries, you can check out the Spring Data Elasticsearch documentation and Elasticsearch’s High Level Client documentation.
Below is the GitHub link for the GitHub project used for this article:
Previously published at codecurated.com.

Written by brilianfird | A Software Engineer based in Indonesia.
Published by HackerNoon on 2021/04/14