At Local Measure we collect thousands of events a day from different social networks. We use these events to calculate a variety of metrics. Some metrics are used to render graphs and charts on the Local Measure dashboard. Others are not visible to users but are used internally, for example to send “heads up” notifications to our customers when something important happens in their Facebook feed.

We’ve been using MongoDB as our storage backend, so when the time came to build a metrics system we decided to implement it on top of MongoDB. We had two main requirements for the metrics system. First, it had to be fast when dealing with time-based data, since all of our metrics are time based and can be filtered by arbitrary time ranges. Second, metrics had to be easily accessible at different time resolutions: hourly, daily, monthly. This is how we build our histogram graphs. Taking all this into account, we created a separate database in our MongoDB cluster with several collections and a bunch of sophisticated indexes. It worked fine in the beginning, but as more features were introduced we started to see drawbacks. It was inflexible when dealing with ad-hoc time intervals covering partial granularities (e.g. seeing the number of posts for several years starting at a random date in September and finishing at another random date in May). The indexes grew larger and started slowing the whole system down. All in all, the maintenance overhead made us look elsewhere. We evaluated several options and concluded that elasticsearch suited all our needs.

In the developer world elasticsearch is known as the de-facto solution for full-text search, so its other nice features are often overlooked. That is unfortunate, because it has a great aggregation framework called “facets”. Facets were developed to perform statistical analysis of returned results on the fly, but with a touch of creativity they can also be used as a standalone feature for analysing data stored in elasticsearch indexes.

Elasticsearch comes with a dozen different facets. At Local Measure we mostly use the “term” and “date histogram” facets. The latter is the driving force behind our posts/check-ins graph. For example, when our customers want to see total check-ins for the last day, month or year, the elasticsearch “date histogram” facet does all the work of aggregating numbers into “buckets”. The size of these buckets depends on the length of the time interval. This is the approximate query we use:

{    "facets": {        "check_ins_by_hour": {            "date_histogram": {                "key_field": "timestamp",                "value_field": "check_ins",                "interval": "1h"            },            "facet_filter": {                "bool": {                    "must": {                        "range": {                            "timestamp": {                                "gte": "2013-09-10T03:00:00",                                "lte": "2013-09-11T04:00:00"                            }                        }                    }                }            }        }    } }

Elasticsearch is quite fast when dealing with date-based histograms. One use of the “term” facet is calculating the total number of posts per social network for a certain time frame. The “term” in this case is the social network.

This is a similar query:

{    "facets": {        "total_posts_by_network": {            "terms_stats": {                "key_field": "social_network",                "value_field": "posts"            },            "facet_filter": { ... }        }    } }

We’ve done performance testing to see how elasticsearch slows down as indexes grow. Of course at some point it did slow down, but not nearly as soon as our previous MongoDB-based system did. However, there is a solution for this too. Elasticsearch is designed to run in a cluster, so it comes with all the goodies of contemporary cluster building. One of the most powerful features is sharding. Right now we have a cluster consisting of two elasticsearch instances, with more instances ready to join as the load grows. We shard metrics indexes by social network, so when a query comes in to calculate metrics for Facebook only, elasticsearch automatically routes it to the instance responsible for Facebook. Only that instance does the calculations, while the others take care of similar queries for other social networks. This allows us to run calculations for several social networks in parallel.
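In practice this kind of setup can be expressed with elasticsearch’s custom routing: documents are indexed with the social network as the routing value, and the same value is passed at query time so only that shard does the work. A rough sketch, with placeholder index, type and document values:

from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# Index each metrics document with its social network as the routing value,
# so all documents for one network land on the same shard.
doc = {
    "social_network": "facebook",
    "timestamp": "2013-09-10T03:15:00",
    "check_ins": 12,
    "posts": 3,
}
es.index(index="metrics", doc_type="metric", body=doc,
         routing=doc["social_network"])

# At query time the same routing value sends the search to that shard only,
# leaving the other shards free to answer queries for other networks.
es.search(index="metrics", body=query, routing="facebook")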

So far we’ve been really happy with elasticsearch. We continue experimenting with it, and in the future we might use it for more ad-hoc number crunching.

– Roman Kalyakin, Software Engineer, Local Measure