[...] it may happen in production – you will have 5-15 strings across 1M objects.
Just be aware that with a non-unique index over many rows but only a few distinct values (low cardinality), ordered indices can perform much better than hashed ones for some access patterns. We hit exactly this issue, and it was easily fixed by switching to an ordered index. FWIW. --DD