What are segments in Lucene?

The Lucene index is split into smaller chunks called segments. Each segment is its own index. Lucene searches all of them in sequence. A new segment is created when a new writer is opened and when a writer commits or is closed. The advantages of using this system are that you never have to modify …
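The segment idea described in this excerpt can be illustrated with a toy model. This is only a sketch and not Lucene's actual implementation or API: the class `ToySegmentedIndex` and its methods are hypothetical names, and whitespace tokenization is an assumption made to keep it short.

```python
class ToySegmentedIndex:
    """Toy model of a segmented index (hypothetical, not Lucene's API)."""

    def __init__(self):
        self.segments = []   # committed segments; never modified once appended
        self._buffer = {}    # doc_id -> text, buffered and not yet searchable
        self._next_id = 0

    def add(self, text):
        """Buffer a document; it becomes searchable only after commit()."""
        doc_id = self._next_id
        self._next_id += 1
        self._buffer[doc_id] = text
        return doc_id

    def commit(self):
        """Write the buffered documents out as one new segment."""
        if not self._buffer:
            return
        postings = {}
        for doc_id, text in self._buffer.items():
            # Assumption: lowercase whitespace tokenization.
            for term in set(text.lower().split()):
                postings.setdefault(term, []).append(doc_id)
        self.segments.append(postings)
        self._buffer = {}

    def search(self, term):
        """Search every segment in sequence and concatenate the hits."""
        hits = []
        for segment in self.segments:
            hits.extend(segment.get(term.lower(), []))
        return hits


index = ToySegmentedIndex()
index.add("fast search")
index.commit()               # first segment
index.add("Fast index")
index.commit()               # second segment; the first is left untouched
print(len(index.segments), index.search("fast"))
```

Each commit appends a new segment instead of rewriting an existing one, which is the property the excerpt points to.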

Elasticsearch always returning “mapping type is missing”

It turns out this happens because the mapping needs to be applied to the type. I tried applying it to the index rather than the type: curl -XPUT 10.160.86.134:9200/products/_mapping -d '{ It needs to be applied to the type, like so: curl -XPUT 10.160.86.134:9200/products/product/_mapping -d '{ It’s sad that a simple Google search couldn’t answer this. Also the …
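The difference between the two request paths in this excerpt can be made explicit with a small helper. This is a sketch only: `mapping_url` is a hypothetical function, it only builds the URL and sends nothing, and it applies to Elasticsearch versions that still have mapping types (later releases removed them).

```python
def mapping_url(host, index, doc_type=None):
    """Build the _mapping URL for an index, or for a type inside that index."""
    parts = [index]
    if doc_type:
        parts.append(doc_type)
    parts.append("_mapping")
    return "http://" + host + "/" + "/".join(parts)


# Index-level path: the form that produced "mapping type is missing" in the post.
print(mapping_url("10.160.86.134:9200", "products"))
# Type-level path: the form the post says is required.
print(mapping_url("10.160.86.134:9200", "products", "product"))
```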

Why are document stores like Lucene / Solr not included in NoSQL conversations?

I once listened to an interview with author Ursula K. Le Guin about fiction writing. The interviewer asked her about authors who work in different genres of writing. What makes one author a romance writer, and another a mystery writer, and another a science fiction writer? Le Guin responded by explaining: Genre is about marketing, not about …

How does Lucene work?

Lucene is an inverted full-text index. This means that it takes all the documents, splits them into words, and then builds an index entry for each word that lists the documents containing it. Because lookups are exact, unordered string matches, they can be extremely fast. Hypothetically, an SQL unordered index on a varchar field could be just as fast, and in …
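The idea of an inverted index can be sketched in a few lines. This is a minimal illustration and not how Lucene stores its data: the function names are hypothetical, and the tokenization (lowercasing, then splitting on non-word characters) is an assumption.

```python
import re
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        # Assumption: lowercase, then split on non-word characters.
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return dict(index)


def lookup(index, word):
    """Exact string match: a single hash lookup, no scan over the documents."""
    return index.get(word.lower(), set())


docs = ["Lucene is fast", "SQL is flexible"]
inverted = build_inverted_index(docs)
print(lookup(inverted, "is"))        # ids of both documents
print(lookup(inverted, "lucene"))    # id of the first document only
print(lookup(inverted, "luc"))       # empty: prefixes do not match
```

The lookup cost does not depend on how long the documents are, only on finding the word in the dictionary, which is why an exact-match index of this kind can be so fast.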