Tokenization is the act of breaking a sequence of text into pieces such as words, keywords, phrases, symbols, and other elements called tokens. These tokens then become the input for further processing such as parsing or text mining.
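As a rough illustration, a simple tokenizer can be written with a regular expression. The sketch below is a minimal Python example under an assumed pattern (not a standard one) that treats words, simple contractions, and punctuation marks as separate tokens.

```python
import re

# Illustrative pattern (an assumption, not a standard):
# - \w+(?:'\w+)?  matches words and simple contractions like "doesn't"
# - [^\w\s]       matches each punctuation mark as its own token
TOKEN_PATTERN = re.compile(r"\w+(?:'\w+)?|[^\w\s]")

def tokenize(text):
    """Split raw text into a list of word and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)

if __name__ == "__main__":
    sentence = "Tokenization breaks text into tokens, doesn't it?"
    print(tokenize(sentence))
    # ['Tokenization', 'breaks', 'text', 'into', 'tokens', ',', "doesn't", 'it', '?']
```

Real tokenizers are usually more involved: they handle abbreviations, hyphenation, numbers, and language-specific rules, and libraries such as NLTK or spaCy provide ready-made implementations.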