camel_arclean

About

The camel_arclean utility cleans Arabic text by:

  • Deleting characters that are not in Arabic, ASCII, or Latin-1.
  • Converting all spacing characters to an ASCII space character.
  • Converting Indic digits into Arabic digits.
  • Converting extended Arabic letters into basic Arabic letters.
  • Converting single-character presentation forms into their basic forms.
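
The steps listed above can be approximated in a few lines of Python. This is only a rough sketch using the standard library, not the tool's actual implementation. It assumes that "Arabic" means the Unicode block U+0600 to U+06FF, that "Arabic digits" means the ASCII digits 0-9, and that spacing characters are the Unicode space separators plus tab. The names clean_line, EXTENDED_TO_BASIC, and the two helpers are hypothetical, and the extended-letter table is a two-entry illustration rather than the tool's real mapping.

```python
import unicodedata

# Illustrative subset only (assumption); the tool's real table is not shown here.
EXTENDED_TO_BASIC = {
    "\u06A9": "\u0643",  # ARABIC LETTER KEHEH -> ARABIC LETTER KAF
    "\u06CC": "\u064A",  # ARABIC LETTER FARSI YEH -> ARABIC LETTER YEH
}


def _simplify_presentation_form(ch):
    """Map a single-character Arabic presentation form to its basic letter."""
    in_forms_a = "\uFB50" <= ch <= "\uFDFF"
    in_forms_b = "\uFE70" <= ch <= "\uFEFC"
    if not (in_forms_a or in_forms_b):
        return ch
    parts = unicodedata.decomposition(ch).split()
    # Expect a tag such as "<isolated>" followed by exactly one code point.
    if len(parts) == 2 and parts[0].startswith("<"):
        return chr(int(parts[1], 16))
    return ch


def _is_allowed(ch):
    """Keep ASCII and Latin-1 (<= U+00FF) and the Arabic block (assumption)."""
    cp = ord(ch)
    return cp <= 0xFF or 0x0600 <= cp <= 0x06FF


def clean_line(text):
    """Apply the cleaning steps to one line of text, character by character."""
    out = []
    for ch in text:
        if unicodedata.category(ch) == "Zs" or ch == "\t":
            ch = " "  # any spacing character becomes an ASCII space
        elif "\u0660" <= ch <= "\u0669":
            ch = str(ord(ch) - 0x0660)  # Arabic-Indic digit -> ASCII digit
        elif "\u06F0" <= ch <= "\u06F9":
            ch = str(ord(ch) - 0x06F0)  # Extended Arabic-Indic digit -> ASCII digit
        else:
            ch = EXTENDED_TO_BASIC.get(ch, ch)
            ch = _simplify_presentation_form(ch)
        if _is_allowed(ch):  # drop anything outside the allowed ranges
            out.append(ch)
    return "".join(out)
```

Conversions run before the deletion step so that a character such as U+2003 (EM SPACE) becomes a plain space instead of being dropped.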

Usage

Below is the usage information printed by running camel_arclean --help.

Usage:
    camel_arclean [-o OUTPUT | --output=OUTPUT] [FILE]
    camel_arclean (-v | --version)
    camel_arclean (-h | --help)

Options:
  -o OUTPUT --output=OUTPUT
        Output file. If not specified, output will be printed to stdout.
  -h --help
        Show this screen.
  -v --version
        Show version.
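
The invocation forms above can also be driven from a script. The following is a minimal sketch, assuming camel_arclean is installed and on the PATH and that its output is UTF-8. The helper names build_arclean_command and run_arclean are hypothetical; only the -o option and the FILE argument come from the usage text.

```python
import subprocess


def build_arclean_command(input_path=None, output_path=None):
    """Build the argument list for camel_arclean from the documented options."""
    cmd = ["camel_arclean"]
    if output_path is not None:
        cmd += ["-o", output_path]  # write to OUTPUT instead of stdout
    if input_path is not None:
        cmd.append(input_path)  # the optional FILE argument
    return cmd


def run_arclean(input_path, output_path=None):
    """Run camel_arclean and return whatever it printed to stdout.

    Assumes the executable is on the PATH and emits UTF-8 (assumption).
    Raises subprocess.CalledProcessError on a non-zero exit status.
    """
    result = subprocess.run(
        build_arclean_command(input_path, output_path),
        capture_output=True,
        encoding="utf-8",
        check=True,
    )
    return result.stdout


# Example (file names are placeholders):
#   cleaned = run_arclean("input.txt")
#   run_arclean("input.txt", output_path="cleaned.txt")
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.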