camel_transliterate
===================

About
-----

The ``camel_transliterate`` tool transliterates text from one form to another
using one of the builtin transliteration schemes.
Tokens can also be prefixed with a marker to indicate that they should not be
transliterated.

Usage
-----

Below is the usage information that can be generated by running
``camel_transliterate --help``.

.. code-block:: none

   Usage:
       camel_transliterate (-s SCHEME | --scheme=SCHEME)
                           [-m MARKER | --marker=MARKER]
                           [-I | --ignore-markers]
                           [-S | --strip-markers]
                           [-o OUTPUT | --output=OUTPUT] [FILE]
       camel_transliterate (-l | --list)
       camel_transliterate (-v | --version)
       camel_transliterate (-h | --help)

   Options:
     -s SCHEME --scheme
                             Scheme used for transliteration.
     -o OUTPUT --output=OUTPUT
                             Output file. If not specified, output will be
                             printed to stdout.
     -m MARKER --marker=MARKER
                             Marker used to prefix tokens not to be
                             transliterated. [default: @@IGNORE@@]
     -I --ignore-markers     Transliterate marked words as well.
     -S --strip-markers      Remove markers in output.
     -l --list               Show a list of available transliteration schemes.
     -h --help               Show this screen.
     -v --version            Show version.

Below is a list of currently available transliteration schemes.

.. code-block:: none

   ar2bw           Arabic to Buckwalter
   ar2safebw       Arabic to Safe Buckwalter
   ar2xmlbw        Arabic to XML Buckwalter
   ar2hsb          Arabic to Habash-Soudi-Buckwalter
   bw2ar           Buckwalter to Arabic
   bw2safebw       Buckwalter to Safe Buckwalter
   bw2xmlbw        Buckwalter to XML Buckwalter
   bw2hsb          Buckwalter to Habash-Soudi-Buckwalter
   safebw2ar       Safe Buckwalter to Arabic
   safebw2bw       Safe Buckwalter to Buckwalter
   safebw2xmlbw    Safe Buckwalter to XML Buckwalter
   safebw2hsb      Safe Buckwalter to Habash-Soudi-Buckwalter
   xmlbw2ar        XML Buckwalter to Arabic
   xmlbw2bw        XML Buckwalter to Buckwalter
   xmlbw2safebw    XML Buckwalter to Safe Buckwalter
   xmlbw2hsb       XML Buckwalter to Habash-Soudi-Buckwalter
   hsb2ar          Habash-Soudi-Buckwalter to Arabic
   hsb2bw          Habash-Soudi-Buckwalter to Buckwalter
   hsb2safebw      Habash-Soudi-Buckwalter to Safe Buckwalter
   hsb2xmlbw       Habash-Soudi-Buckwalter to XML Buckwalter

Notes on markers
----------------

A marker is a string with no whitespace characters at the beginning, middle, or
end of it (in other words, it is a single token without padding spaces).
As a rule of thumb, pick a marker that is unlikely to appear in your text.
We use ``@@IGNORE@@`` as the default value, while some Arabic NLP tools use
``@@LAT@@`` to denote Latin/foreign text.

Notes on schemes
----------------

The transliteration schemes ``ar2bw``, ``ar2safebw``, ``ar2xmlbw``, ``ar2hsb``,
``bw2ar``, ``bw2safebw``, ``bw2xmlbw``, ``bw2hsb``, ``safebw2ar``,
``safebw2bw``, ``safebw2xmlbw``, ``safebw2hsb``, ``xmlbw2ar``, ``xmlbw2bw``,
``xmlbw2safebw``, ``xmlbw2hsb``, ``hsb2ar``, ``hsb2bw``, ``hsb2safebw``, and
``hsb2xmlbw`` use the conversion table listed in
:doc:`../reference/encoding_schemes`.
Input characters not listed in the conversion table are output unchanged,
without any transliteration.
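
The marker and pass-through behavior described above can be illustrated with a
small, self-contained Python sketch. It is not the tool's implementation: the
names ``TOY_AR2BW``, ``map_chars``, and ``transliterate_line`` are hypothetical,
the table covers only four Arabic letters for illustration, and the handling of
edge cases (such as a marker standing alone as a token) is an assumption, since
this page does not specify it.

```python
import re

# Illustrative four-letter subset of an Arabic-to-Buckwalter table.
# This is NOT the full conversion table used by camel_transliterate.
TOY_AR2BW = {
    "\u0627": "A",  # alef
    "\u0628": "b",  # beh
    "\u062a": "t",  # teh
    "\u0643": "k",  # kaf
}

DEFAULT_MARKER = "@@IGNORE@@"


def map_chars(text, table):
    """Map each character through the table; unlisted characters pass through."""
    return "".join(table.get(ch, ch) for ch in text)


def transliterate_line(line, table, marker=DEFAULT_MARKER,
                       ignore_markers=False, strip_markers=False):
    """Transliterate a line, leaving marker-prefixed tokens alone by default."""
    # A marker must be a single token: non-empty and free of whitespace.
    if not marker or re.search(r"\s", marker):
        raise ValueError("marker must be a non-empty string without whitespace")

    out = []
    # Split on whitespace but keep the separators so spacing is preserved.
    for piece in re.split(r"(\s+)", line):
        if piece.startswith(marker) and len(piece) > len(marker):
            body = piece[len(marker):]
            if ignore_markers:
                # Mirrors --ignore-markers: marked words are transliterated too.
                body = map_chars(body, table)
            # Mirrors --strip-markers: drop the marker from the output.
            out.append(body if strip_markers else marker + body)
        else:
            out.append(map_chars(piece, table))
    return "".join(out)


# The Arabic word "kitab" (kaf, teh, alef, beh) appears unmarked and marked.
word = "\u0643\u062a\u0627\u0628"
line = word + " " + DEFAULT_MARKER + word + " 2024"

print(transliterate_line(line, TOY_AR2BW))
print(transliterate_line(line, TOY_AR2BW, strip_markers=True))
print(transliterate_line(line, TOY_AR2BW, ignore_markers=True))
```

In this sketch the unmarked word becomes ``ktAb``, the marked copy is left in
Arabic script unless ``ignore_markers`` is set, and ``2024`` passes through
unchanged because its characters are not in the table.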