Microsoft Bling Fire Tokenizer – 10x Faster Than NLTK

We wanted to share our FInite State machine and REgular expression manipulation library (FIRE). Bling Fire Tokenizer is designed for fast, high-quality tokenization of natural language text. Compared with other popular NLP libraries, Bling Fire tokenizes text about 10x faster.

See more at the benchmark wiki.

To start using the Bling Fire library and its finite state machine manipulation tools, you can build the project on Windows or Linux with CMake.
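For a quick look at what tokenization with the library looks like, here is a minimal sketch in Python. It assumes the `blingfire` package from PyPI is installed and uses its `text_to_words` function, neither of which is mentioned in the text above. The `tokenize` wrapper and its whitespace fallback are hypothetical helpers added only so the snippet still runs when the package is missing.

```python
# Assumption: the `blingfire` PyPI package is installed (pip install blingfire).
# If it is missing or its native library fails to load, we fall back to a
# naive whitespace split that is NOT equivalent to Bling Fire's tokenization.
try:
    from blingfire import text_to_words
except (ImportError, OSError):
    text_to_words = None


def tokenize(text):
    """Hypothetical helper: return a list of tokens for `text`."""
    if text_to_words is not None:
        # text_to_words returns a single string with tokens separated by spaces.
        return text_to_words(text).split()
    # Fallback for illustration only: plain whitespace splitting.
    return text.split()


if __name__ == "__main__":
    print(tokenize("Bling Fire is a tokenizer"))
```

The fallback only splits on whitespace, so it will not separate punctuation the way a real tokenizer does; it is there to keep the example runnable, not to stand in for the library.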

Source: github.com