An index can help even on low cardinality fields if:
-
When one of possible values is very infrequent compared to the other values and you search for it.
For instance, there are very few color blind women, so this query:
SELECT * FROM color_blind_people WHERE gender="F"
would most probably benefit from an index on
gender
. -
When the values tend to be grouped in the table order:
SELECT * FROM records_from_2008 WHERE year = 2010 LIMIT 1
Though there are only
3
distinct years here, records with earlier years are most probably added first so very many records would have to be scanned prior to returning the first2010
record if not for the index. -
When you need
ORDER BY / LIMIT
:SELECT * FROM people ORDER BY gender, id LIMIT 1
Without the index, a
filesort
would be required. Though it’s somewhat optimized do to theLIMIT
, it would still need a full table scan. -
When the index covers all fields used in the query:
CREATE INDEX (low_cardinality_record, value) SELECT SUM(value) FROM mytable WHERE low_cardinality_record = 3
-
When you need
DISTINCT
:SELECT DISTINCT color FROM tshirts
MySQL
will useINDEX FOR GROUP-BY
, and if you have few colors, this query will be instant even with millions of records.This is an example of a scenario when the index on a low cardinality field is more efficient than that on a high cardinality field.
Note that if DML
performance is not much on an issue, then it’s safe to create the index.
If optimizer thinks that the index is inefficient, the index just will not be used.