I'm curious: how did you compare the libraries?
What was the test case?
Was it many point-in-polygon tests against a single polygon?
If so, have you tried computing the polygon's bounding box once and using it as a cheap first check, to eliminate this difference between the libraries?
Are the benchmarks available somewhere?
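
To make the bounding-box question concrete, here is a rough sketch of what I mean. It uses only the standard library. The names Point, Box, compute_bbox, in_box, in_ring and within_with_precheck are made up for illustration, and in_ring is just a plain even-odd ray-casting stand-in for whichever library's point-in-polygon test you are benchmarking, not the code of either library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal types, for illustration only.
struct Point { double x, y; };
struct Box { double min_x, min_y, max_x, max_y; };

// Compute the axis-aligned bounding box once per polygon.
// Assumes the ring has at least one vertex.
Box compute_bbox(const std::vector<Point>& ring) {
    assert(!ring.empty());
    Box b{ring[0].x, ring[0].y, ring[0].x, ring[0].y};
    for (const Point& p : ring) {
        b.min_x = std::min(b.min_x, p.x);
        b.min_y = std::min(b.min_y, p.y);
        b.max_x = std::max(b.max_x, p.x);
        b.max_y = std::max(b.max_y, p.y);
    }
    return b;
}

// Cheap rejection test: four comparisons per point.
bool in_box(const Box& b, const Point& p) {
    return p.x >= b.min_x && p.x <= b.max_x &&
           p.y >= b.min_y && p.y <= b.max_y;
}

// Stand-in for the expensive test: even-odd ray casting over all edges.
bool in_ring(const std::vector<Point>& ring, const Point& p) {
    bool inside = false;
    for (std::size_t i = 0, j = ring.size() - 1; i < ring.size(); j = i++) {
        const Point& a = ring[i];
        const Point& b = ring[j];
        // Only edges that straddle the horizontal line through p can be crossed.
        if ((a.y > p.y) != (b.y > p.y) &&
            p.x < (b.x - a.x) * (p.y - a.y) / (b.y - a.y) + a.x) {
            inside = !inside;
        }
    }
    return inside;
}

// Run the full test only for points that fall inside the precomputed box.
bool within_with_precheck(const std::vector<Point>& ring, const Box& box,
                          const Point& p) {
    return in_box(box, p) && in_ring(ring, p);
}
```

If one library already does something like this internally and the other does not, calling compute_bbox once outside the timing loop and then within_with_precheck for every point should put both on a more equal footing.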

Also, the usual suspect when it comes to templated libraries and performance: have you made sure compiler optimizations are enabled (for example -O2 with GCC or Clang, or a Release build in Visual Studio)?

Regards
Bruno