How we made Haskell search strings as fast as Rust
Because we use the type for strings in Haskell, which is an array of UTF-16 code units in native endianness, the benchmarks would take UTF-16 data as input. We packed the 16-bit input code unit and the 32-bit go-to state together in a 64-bit integer, and now the transition table was a cache-friendly array of 64-bit integers. To stick with the earlier example, suppose we had the automaton set up like so, in pseudo-Python to illustrate the idea without too much noise:
Then we could step the automaton with a triply nested loop: one over the input characters, one to follow transitions, and one for the linear scan through the transition table.
Source: tech.channable.com