Flutter — handle Elasticsearch Queries with the Builder Pattern

devops terminal
9 min read · Oct 10, 2020
Photo courtesy: Littlehampton Bricks from Pexels

Foreword…

If you have ever worked with Elasticsearch, you know that a specific query syntax (named Query DSL) is used to tell Elasticsearch what to do for us. Query DSL is a human-readable format expressed in JSON, and it is usually quite straightforward to write a query that handles a search request; however… if we are supposed to handle dynamic query conditions… then we very soon fall into a swamp of misery, where we need to take care of lots of String-level modifications and still make sure the query works.

PS. We are going to build the solution on top of a demo app available at: https://gitlab.com/quoeamaster/blog_flutter_http_api.git

PS. If you are curious about what this demo app does, do read this blog: https://medium.com/@quoeamaster/flutter-all-you-need-to-know-about-the-http-package-56de74e40ecb

Let’s start with an example~

The Challenge — hardcoded String Query

Based on the demo app, we send out an HTTP GET request with a hardcoded String query to the target Elasticsearch cluster; the response returned is displayed in the TextField widget.

Let’s take a look at the eCommerce query:

Typically, it is verbose and covers different aspects of how Elasticsearch should handle our request (e.g. which fields are supposed to be available in the response, through the “_source” config).
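
To make the idea concrete, here is a rough sketch of what such a hardcoded String query could look like in Dart. It is not the query from the demo app: the field names are assumptions borrowed from Kibana's sample eCommerce data set, and parseQuery is a hypothetical helper added only to show that the String is valid JSON.

```dart
import 'dart:convert';

// A hypothetical hardcoded query. The field names are assumptions borrowed
// from Kibana's sample eCommerce data set, not taken from the demo app.
const String hardcodedQuery = '''
{
  "size": 10,
  "_source": ["customer_first_name", "day_of_week", "taxful_total_price"],
  "query": {
    "match_all": {}
  }
}
''';

// Hypothetical helper: parses the String once so that a malformed query
// fails fast, before any request is sent to the cluster.
Map<String, dynamic> parseQuery(String query) =>
    jsonDecode(query) as Map<String, dynamic>;
```

Everything here is fixed at compile time, which is exactly why this style stops working once the user is allowed to influence the search.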

Hm… things look good, don’t they? Indeed, this hardcoded approach works fine as long as we are not looking into dynamic query conditions. For example, if this query fetches a general overview of eCommerce activities for the landing page regardless of who the user is… then the hardcoded approach works perfectly fine~

But the reality is that users have their own specific search needs, so our application needs to cater for dynamic query conditions. At least let the users type in what they are looking for, right?

Good, we know the reasons for supporting dynamic query conditions, so how should we get started? The simplest way to solve the problem is to modify the query through if-then-else logic.

The if-then-else solution

Assume that our application now provides 2 dynamic options as follows:

  • choose the day of the week for data filtering
  • filter data based on the customer’s first name

Figure: the new UI supporting day-of-week and first-name filtering

Both of these criteria are optional; however, once the corresponding values are submitted, the results returned are filtered accordingly.

A quick solution that springs to mind would be like this:

  • we set up 2 variables to store the above values
  • based on the variables’ values, we create the corresponding query parts
  • finally, we assemble the query parts together to form the query for search.

Indeed, setting up the corresponding variables for the dynamic options is 100% correct. The corresponding validation rules would also be applied to the variables at the same time (e.g. the day of the week MUST be in the range 0 to 6).

Next step: building the query parts based on the variables’ values. This is where a heavy load of if-then-else appears. Typically, null-checks involve an “if”, validation-checks involve another set of “ifs”, singular vs array checks involve another “if”… plus many more. At first, the number of “ifs” is manageable, but as time goes by and the conditions increase, it becomes a headache for maintainability. Just look at the lines of code for that “build” method and you will know what I mean :(

The final step in this solution is assembling; again, a number of “ifs” might be required to assemble only the valid parts together.
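
The three steps above could be sketched roughly as follows. This is a minimal illustration and not the demo app's actual “build” method; the function name buildQuery and the field names day_of_week_i and customer_first_name are assumptions made for the example.

```dart
import 'dart:convert';

// Builds the query String with plain if-then-else logic.
// `dayOfWeek` and `firstName` are the two optional filters; the field names
// `day_of_week_i` and `customer_first_name` are assumptions for illustration.
String buildQuery({int? dayOfWeek, String? firstName}) {
  final parts = <String>[];

  // Null-check plus validation-check for the day of the week.
  if (dayOfWeek != null) {
    if (dayOfWeek < 0 || dayOfWeek > 6) {
      throw ArgumentError.value(dayOfWeek, 'dayOfWeek', 'must be 0 to 6');
    }
    parts.add('{"term": {"day_of_week_i": $dayOfWeek}}');
  }

  // Null-check plus empty-check for the first name.
  if (firstName != null && firstName.trim().isNotEmpty) {
    // jsonEncode adds the surrounding quotes and escapes special characters.
    final name = jsonEncode(firstName.trim());
    parts.add('{"match": {"customer_first_name": $name}}');
  }

  // Assembling: yet another "if" to decide which query type to emit.
  if (parts.isEmpty) {
    return '{"query": {"match_all": {}}}';
  }
  final joined = parts.join(', ');
  return '{"query": {"bool": {"must": [$joined]}}}';
}
```

With only two filters there are already four “ifs”; every new filter adds its own null-check, validation and String fragment to the same function.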

  • pros: the if-then-else approach is the simplest to implement; not much design or refactoring is required
  • cons: maintainability is the major downside of this solution; imagine maintaining dozens of lines of code just to build a query

Alright, we are done with the if-then-else approach; what other options do we have in our pockets? The next approach is the Builder Pattern.

Introducing the Builder Pattern solution

Before we move on, let’s understand what a builder pattern is. According to Wikipedia:

The builder pattern is a design pattern designed to provide a flexible solution to various object creation problems in object-oriented programming. The intent of the Builder design pattern is to separate the construction of a complex object from its representation. It is one of the Gang of Four design patterns.

Holy… what are you talking about??? :)

Yeah I know, it reads like a puzzle, doesn’t it? The main idea of a builder is to separate the construction of a complex object from its representation. This is exactly what we need to improve on the if-then-else approach! The question is how to “separate”.

Another important aspect of a builder is that we can build the final product (in our case the final query) step by step. For example, in a boolean query with multiple criteria, we can focus on one criterion at a time during the build, so the complexity is reduced. You can treat each criterion as its own method with a minimal set of if-then-else logic (plus validations). Usually a builder exposes a “build” method for assembling the final product; this method is called last, once every condition has been set / handled.

Let’s take a look at the following code snippet on how things work:

First of all, we need to set up an abstract class: QueryBuilder. This class acts as the base of all query-builders and provides methods to handle the settings common to all query types. For example, it contains a method to handle the “_source” config and another method to build the source’s query part.

Typically, when we need to handle a particular condition / field, we expose 2 methods:

  • a setter method for setting the value for this condition / field (e.g. sourceInclude method)
  • a getter method for building the query part for a condition / field (e.g. buildSourcePartsOnly method)

1 important thing about the setter method is that it always returns the builder’s current instance (i.e. the “this” reference). This is SURPRISINGLY important, as returning “this” lets us chain conditions / fields. (e.g. _builder.sourceInclude([“field_a”]).field(“firstname”, “john”).build(); returns the final query matching firstname containing john and returning only field_a in the response)

There are a few methods without implementations (e.g. build, buildQueryPartsOnly); the child classes are expected to provide the correct implementations based on the query type.
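
A minimal sketch of such an abstract base class might look like the following. The method names follow the description above, but the method bodies, the Map-based representation and the MatchAllQueryBuilder child are assumptions added only so that the sketch can be exercised.

```dart
import 'dart:convert';

// A sketch of the abstract base class. Method names follow the article; the
// bodies and the Map-based representation are assumptions.
abstract class QueryBuilder {
  final List<String> _source = [];

  // Setter: stores the fields to return and hands back `this` for chaining.
  QueryBuilder sourceInclude(List<String> fields) {
    _source
      ..clear()
      ..addAll(fields);
    return this;
  }

  // Getter: builds only the "_source" part (empty when nothing was set).
  Map<String, dynamic> buildSourcePartsOnly() =>
      _source.isEmpty ? {} : {'_source': List<String>.of(_source)};

  // Left to the child classes, since these depend on the query type.
  Map<String, dynamic> buildQueryPartsOnly();
  String build();
}

// A trivial child, hypothetical and only here to exercise the base class.
class MatchAllQueryBuilder extends QueryBuilder {
  @override
  Map<String, dynamic> buildQueryPartsOnly() => {
        'query': {'match_all': <String, dynamic>{}}
      };

  @override
  String build() =>
      jsonEncode({...buildSourcePartsOnly(), ...buildQueryPartsOnly()});
}
```

Calling MatchAllQueryBuilder().sourceInclude(['field_a']).build() chains the setter and the final build in one expression, which is possible only because the setter returns “this”. Building Maps and encoding them at the very end, rather than concatenating Strings, is a design choice of this sketch; it leaves quoting and escaping to jsonEncode.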

The above code snippet also includes a child class named MatchQueryBuilder. As you can guess, this implementation focuses on the Match Query. Note that each builder implementation should maintain its own condition / field logic; a match-query-builder needs to handle the target “field” and its value, hence 2 variables are required:

  • _field — the target field’s name in String, compulsory
  • _value — the field’s expected value for matching, compulsory

For a match query, we can also switch the matching behaviour for multiple words from “or” to “and”; that’s why we need to add 1 more variable:

  • _operator — the multi word matching behaviour, default is “or”

To set the value for the target field, simply call the “field” method with the “field” and “value” parameters (the operator defaults to “or” unless provided).

The corresponding abstract methods, build and buildQueryPartsOnly, must be implemented as well.
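
Putting the description of MatchQueryBuilder together, one possible sketch is shown below. The base class is repeated in condensed form so that the example stands on its own. The parameter name matchOperator, the validation rules and the sourceInclude override with a narrowed return type are assumptions of this sketch, not necessarily how the demo app does it.

```dart
import 'dart:convert';

// Condensed base class so this sketch stands on its own (bodies are assumed).
abstract class QueryBuilder {
  final List<String> _source = [];

  QueryBuilder sourceInclude(List<String> fields) {
    _source
      ..clear()
      ..addAll(fields);
    return this;
  }

  Map<String, dynamic> buildSourcePartsOnly() =>
      _source.isEmpty ? {} : {'_source': List<String>.of(_source)};

  Map<String, dynamic> buildQueryPartsOnly();
  String build();
}

class MatchQueryBuilder extends QueryBuilder {
  String? _field; // compulsory
  Object? _value; // compulsory
  String _operator = 'or'; // multi-word matching behaviour

  // Narrowing the return type keeps `.field(...)` reachable after this call.
  @override
  MatchQueryBuilder sourceInclude(List<String> fields) {
    super.sourceInclude(fields);
    return this;
  }

  // Setter for the match condition; validation lives next to what it guards.
  MatchQueryBuilder field(String name, Object value,
      {String matchOperator = 'or'}) {
    if (name.isEmpty) throw ArgumentError('field name must not be empty');
    if (matchOperator != 'or' && matchOperator != 'and') {
      throw ArgumentError.value(matchOperator, 'matchOperator');
    }
    _field = name;
    _value = value;
    _operator = matchOperator;
    return this;
  }

  @override
  Map<String, dynamic> buildQueryPartsOnly() {
    final name = _field;
    final value = _value;
    if (name == null || value == null) {
      throw StateError('field(...) must be called before building');
    }
    return {
      'query': {
        'match': {
          name: {'query': value, 'operator': _operator}
        }
      }
    };
  }

  @override
  String build() =>
      jsonEncode({...buildSourcePartsOnly(), ...buildQueryPartsOnly()});
}
```

With this in place, MatchQueryBuilder().sourceInclude(['field_a']).field('firstname', 'john').build() yields a match query on firstname that returns only field_a. Without the narrowed override, the chain would stop compiling after sourceInclude, because the base class only promises a QueryBuilder; Dart's cascade notation (..) would be another way around that.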

Q. Hm… now what if one of the methods of this child class needs to invoke the parent’s method?

  • Let’s take a look at the “reset” method. For our match-query-builder, we just need to take care of the 3 parameters (_field, _value, _operator), but for sure we might also want to reset the _source values as well. So based on this principle, if we need to reset parameters that are out of our reach / scope, we should call the parent’s reset method. Simply calling super.reset() triggers the parent’s reset method!

Q. Hey… but you never told me how to set the parent’s _source parameter earlier??

  • Since the match-query-builder inherits every public method from its parent (QueryBuilder class), we simply call sourceInclude() directly! :) Power to Inheritance and Polymorphism~~
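
The two answers above can be illustrated with a stripped-down sketch. The classes are reduced to the state that “reset” touches; the void return type of reset and the source and hasCondition getters are assumptions added purely so that the effect can be observed.

```dart
// Minimal stand-ins for the two classes, reduced to what `reset` touches.
abstract class QueryBuilder {
  final List<String> _source = [];

  QueryBuilder sourceInclude(List<String> fields) {
    _source
      ..clear()
      ..addAll(fields);
    return this;
  }

  // Hypothetical read-only view, added for inspection only.
  List<String> get source => List.unmodifiable(_source);

  // The parent only knows about its own state.
  void reset() => _source.clear();
}

class MatchQueryBuilder extends QueryBuilder {
  String? _field;
  Object? _value;
  String _operator = 'or';

  // Hypothetical getters, added for inspection only.
  bool get hasCondition => _field != null && _value != null;
  String get matchOperator => _operator;

  MatchQueryBuilder field(String name, Object value,
      {String matchOperator = 'or'}) {
    _field = name;
    _value = value;
    _operator = matchOperator;
    return this;
  }

  @override
  void reset() {
    // Clear what this class owns...
    _field = null;
    _value = null;
    _operator = 'or';
    // ...then let the parent clear what only it can reach.
    super.reset();
  }
}
```

Note that sourceInclude is never declared in MatchQueryBuilder here; it is simply inherited, so the child can call it directly on its own instance.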

Now let’s take a look at the pros and cons:

  • pros: more flexible in building our final product (i.e. query), also more control. Another good point is the actual implementation on how to handle and build the query part is hidden from the user / caller; hence whenever the query syntax changes on Elasticsearch side, all we need to do is to update the implementation without troubling the user / caller.
  • cons: some effort on code design is required (intermediate-level programming skills), and we end up with more classes / code than with bare if-then-else.

Why not the Factory Pattern???

Our final question is why we adopt the builder pattern and not the factory pattern. Again, we need to know what the factory pattern is. According to Wikipedia:

In class-based programming, the factory method pattern is a creational pattern that uses factory methods to deal with the problem of creating objects without having to specify the exact class of the object that will be created. This is done by creating objects by calling a factory method — either specified in an interface and implemented by child classes, or implemented in a base class and optionally overridden by derived classes — rather than by calling a constructor.

I know… “what are you talking about again??” :) Put simply, the factory pattern relies on a mechanism that builds the correct implementation of a class for us. Usually a set of factory methods is available for different purposes (e.g. buildJdbcDatasource() would create a JDBC connector whilst buildFileStore() would create a file-based connector) OR sometimes we provide a key / identifier to instruct the factory method which class instance to build (e.g. buildStoreByType(int typeID) would create the designated store instance based on the parameter typeID, which is usually a constant value).
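
As a rough illustration of the identifier-based variant, consider the sketch below. Only the name buildStoreByType comes from the text above; the Store types and the constants are hypothetical.

```dart
// Hypothetical store types; none of these classes come from the demo app.
abstract class Store {
  String describe();
}

class JdbcStore implements Store {
  @override
  String describe() => 'jdbc';
}

class FileStore implements Store {
  @override
  String describe() => 'file';
}

// Hypothetical identifiers understood by the factory method.
const int storeTypeJdbc = 1;
const int storeTypeFile = 2;

// The factory method: the caller passes an identifier and receives a fully
// configured instance in a single call, without naming the concrete class.
Store buildStoreByType(int typeID) {
  switch (typeID) {
    case storeTypeJdbc:
      return JdbcStore();
    case storeTypeFile:
      return FileStore();
    default:
      throw ArgumentError.value(typeID, 'typeID', 'unknown store type');
  }
}
```

The caller only ever sees the Store type; which concrete class sits behind it is decided inside the factory method.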

Another important aspect is that factory pattern / methods build the instance for us in one go. With the builder pattern, in contrast, we can build the final product bit by bit until we are ready to get it (the query in our case).

Q. Hm… alright I know the differences now; but then… why can’t we use factory pattern as it sounds like a good choice too?

  • I would not say that using the factory pattern is inappropriate, but think about our situation. We might have a complex query to build, such as a boolean query with multiple criteria (e.g. 2 must clauses, 1 filter clause, 1 must_not clause and 3 should clauses). If we were using the factory pattern, we would need to design the factory method to handle all these dynamic parameters. The simplest idea is to provide 4 parameters named mustList, mustNotList, shouldList and filterList; now we can provide the 4 lists of criteria every time we need a class instance. The caveat is… we might not need all 4 clause types every time~ But in order to create an instance, we would still need to provide all 4 lists (even though maybe 3 out of 4 lists are empty).
  • Another issue is: what if… we would like to add a missing criterion later on? Of course we could add methods to handle such criteria updates, but then… it would kind of defeat the purpose in the first place; we expect a factory to build something that is already known or stable (usually all configurations are set during the factory method execution, and the created instance provides only utility methods to handle operations instead of changing configurations).
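
The contrast can be sketched side by side. Both buildBoolQuery and BoolQueryBuilder are hypothetical names used for illustration only; the parameter names mustList, mustNotList, shouldList and filterList follow the text above.

```dart
import 'dart:convert';

// Factory style: every clause list has to be supplied up front, in one call.
String buildBoolQuery(
    List<Map<String, dynamic>> mustList,
    List<Map<String, dynamic>> mustNotList,
    List<Map<String, dynamic>> shouldList,
    List<Map<String, dynamic>> filterList) {
  return jsonEncode({
    'query': {
      'bool': {
        'must': mustList,
        'must_not': mustNotList,
        'should': shouldList,
        'filter': filterList,
      }
    }
  });
}

// Builder style: clauses are added one at a time, and only when needed.
class BoolQueryBuilder {
  final Map<String, List<Map<String, dynamic>>> _clauses = {};

  BoolQueryBuilder must(Map<String, dynamic> c) => _add('must', c);
  BoolQueryBuilder mustNot(Map<String, dynamic> c) => _add('must_not', c);
  BoolQueryBuilder should(Map<String, dynamic> c) => _add('should', c);
  BoolQueryBuilder filter(Map<String, dynamic> c) => _add('filter', c);

  BoolQueryBuilder _add(String type, Map<String, dynamic> clause) {
    _clauses.putIfAbsent(type, () => []).add(clause);
    return this;
  }

  String build() => jsonEncode({
        'query': {'bool': _clauses}
      });
}
```

With a single filter clause, the factory call reads buildBoolQuery([], [], [], [term]) with three empty lists, while the builder call is just BoolQueryBuilder().filter(term).build(). Adding a should clause later means one more method call on the builder before build, with no change to any signature.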

That is why the builder pattern seems a more appropriate choice for our use case. On the other hand, if we are building another module / component as a connector to Elasticsearch, the factory pattern would be an excellent choice~

PS. Every design pattern has its strengths and weaknesses, so choose wisely based on which pattern best matches your criteria. :) Remember, there is no silver-bullet pattern for all cases.

Closings

Great~ We have gone through a couple of things again :)

  • Elasticsearch Query is an expression written in JSON format. The contents include lots of things, such as which fields to return and the size of the returned results, but most importantly the search criteria (which are usually dynamic, based on user needs)
  • taming the Query in a pure String way: the if-then-else approach. There is nothing wrong with if-then-else; it is just that maintainability might become an issue as the codebase grows
  • builder pattern to the rescue: breaking the query build logic into modules / components increases flexibility. Maintainability becomes less of an issue, plus logic changes are hidden from users / callers as long as the exposed methods remain intact
  • factory pattern?? A discussion on why factory pattern might not work in our situation. If we do have manageable and stable criteria for building the query, then it is possible to use a factory pattern too~

PS. reminder, if you want to access the code for the flutter project (including the builder pattern related classes) do take a look at https://gitlab.com/quoeamaster/blog_flutter_http_api.git

devops terminal

a java / golang / flutter developer, a big data scientist, a father :)