In general, there are two different scenarios when it comes to indexing: either you have to deal with a stream of data (logs, a Twitter stream, newsfeeds, etc.) or you have nightly database dumps. There might be cases where you have both nightly database dumps… Read more
All posts tagged “Elasticsearch”
Named Entity Annotations in Elasticsearch
This blog post will show how you can use Elasticsearch to extract Named Entities and store them as annotations. There is a really nice plugin written by one of the main Elasticsearch developers, Alexander Reelsen (https://github.com/spinscale/elasticsearch-ingest-opennlp). This plugin wraps the OpenNLP library and allows you to extract… Read more
When simple is better: The boolean similarity module
I had a lecture about Information Retrieval at university. That’s the field that studies search engines. In the first few classes we learned about the history and evolution of language models that are used for search engines. The most basic and simple form of a… Read more
How to build a self-learning search engine with Elasticsearch
This blog post will walk you through a demo that shows how you can use Elasticsearch to build a self-learning search engine. You can apply this technique if you have a user-facing UI and can access the web analytics that track user interaction with… Read more
Test-Driven Relevance Tuning of Elasticsearch using the Ranking Evaluation API
This blog post is written for engineers who are always looking for ways to improve the result sets of their search application built on Elasticsearch. The goal of this post is to raise awareness of why you should care about relevance, what components are involved… Read more
How to use ElasticSearch for Natural Language Processing and Text Mining — Part 2
Welcome to Part 2 of How to use Elasticsearch for Natural Language Processing and Text Mining. It’s been some time since Part 1, so you might want to brush up on the basics before getting started. This time we’ll focus on one very important type… Read more
Text Classification made easy with Elasticsearch
Elasticsearch is widely used as a search and analytics engine. Its capabilities as a text mining API are not as well known. In the following article I’d like to show how text classification can be done with Elasticsearch. With a background in computational linguistics and… Read more
How to use ElasticSearch for Natural Language Processing and Text Mining — Part 1
ElasticSearch is a search engine and an analytics platform, but it also offers many features that are useful for standard Natural Language Processing and Text Mining tasks. Read more…
Statistical aggregations on numeric object array fields
When working with statistical aggregations in ElasticSearch 1.7, I couldn’t find any documentation about how arrays are treated. Of course, you need a numeric field for statistical aggregations. In my particular case I needed arrays of objects, but this should obviously not make a difference.… Read more
Introducing a generic dynamic mapping template for ElasticSearch
Configuring a mapping for ElasticSearch is not required. By definition, and as opposed to Solr, ElasticSearch is schemaless. If not defined, a mapping for a type is created on the fly, based on the first document that is indexed. If another document that is… Read more