camel_word_tokenize =================== About ----- The ``camel_word_tokenize`` tool splits words from punctuation while collapsing contiguous segments of spaces into a single whitespace character. It is also language agnostic and splits all characters marked as punctuation or symbols in the Unicode specification. For example the following sentence: .. code-block:: none Hello, world!!!! مرحبا يا عالم!!! becomes: .. code-block:: none Hello , world ! ! ! ! مرحبا يا عالم ! ! ! Usage ----- Below is the usage information that can be generated by running ``camel_word_tokenize --help``. .. code-block:: none Usage: camel_word_tokenize [-o OUTPUT | --output=OUTPUT] [FILE] camel_word_tokenize (-v | --version) camel_word_tokenize (-h | --help) Options: -o OUTPUT --output=OUTPUT Output file. If not specified, output will be printed to stdout. -h --help Show this screen. -v --version Show version.