camel_tools.morphology.database
The MorphologyDB class parses a morphology database file and generates the indexes used by the analyzer, generator, and reinflector components.
You should never need to access a MorphologyDB instance directly; you only pass it as an argument when creating a new instance of the analyzer, generator, or reinflector.
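As a minimal sketch of that hand-off, the helper below loads a builtin database and passes it to an analyzer. The helper name make_analyzer is hypothetical and introduced only for illustration; it assumes CAMeL Tools and its database files are installed, and that the Analyzer class in camel_tools.morphology.analyzer takes the database as its first argument.

```python
def make_analyzer(db_name=None):
    """Load a builtin database and wrap it in an Analyzer.

    `make_analyzer` is a hypothetical helper, not part of CAMeL Tools.
    """
    # Imported inside the function so that defining the helper does not
    # require CAMeL Tools; calling it does.
    from camel_tools.morphology.database import MorphologyDB
    from camel_tools.morphology.analyzer import Analyzer

    # flags='a' (the default) is sufficient because only analysis is needed.
    db = MorphologyDB.builtin_db(db_name, flags='a')

    # The database is handed to the component and not accessed again.
    return Analyzer(db)
```

Calling make_analyzer() with no arguments would use the default database; the returned analyzer's analyze() method can then be called on a word. A generator or reinflector would presumably be created the same way, with the database loaded using the 'g' or 'r' flag.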
Classes
class camel_tools.morphology.database.MorphologyDB(fpath, flags='a')
Class providing indexes from a given morphology database file.
Parameters:
- fpath (str) – File path to database.
- flags (str) – Flag string (similar to a file-open mode) indicating which components the database will be used for. ‘a’ indicates analysis, ‘g’ indicates generation, and ‘r’ indicates reinflection. ‘r’ is equivalent to ‘ag’ since the reinflector uses both the analyzer and generator components internally. Defaults to ‘a’.
Raises: InvalidDatabaseFlagError – When an invalid flag value is given.
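The flag semantics described above can be illustrated with a small standalone sketch. This is not the library's implementation: parse_flags and InvalidFlagError are hypothetical names standing in for the constructor's flag handling and for InvalidDatabaseFlagError, and the sketch models only the documented behavior.

```python
class InvalidFlagError(ValueError):
    """Hypothetical stand-in for InvalidDatabaseFlagError."""


def parse_flags(flags='a'):
    """Map a flag string to the set of components it enables (illustrative only)."""
    components = set()
    for flag in flags:
        if flag == 'a':
            components.add('analysis')
        elif flag == 'g':
            components.add('generation')
        elif flag == 'r':
            # Reinflection relies on both analysis and generation,
            # which is why 'r' behaves like 'ag'.
            components.update({'analysis', 'generation'})
        else:
            raise InvalidFlagError('Invalid flag: {!r}'.format(flag))
    return frozenset(components)
```

Under this model, parse_flags('r') and parse_flags('ag') yield the same set of components, and any character other than 'a', 'g', or 'r' raises an error.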
all_feats()
Return the set of all features provided by this database instance.
Returns: The set of all features provided by this database instance.
Return type: frozenset of str
static builtin_db(db_name=None, flags='a')
Create a MorphologyDB instance from one of the builtin databases provided.
Parameters:
- db_name (str, optional) – Name of builtin database. You can use list_builtin_dbs() to get a list of builtin databases or see Databases. Defaults to ‘calima-msa-r13’.
- flags (str, optional) – Flag string to be passed to the MorphologyDB constructor. Defaults to ‘a’.
Returns: Instance of builtin database with given flags.
Return type: MorphologyDB
static list_builtin_dbs()
Return a list of the builtin databases provided with CAMeL Tools.
Returns: List of builtin databases.
Return type: list of DatasetEntry
Databases
Below is a list of databases that ship with CAMeL Tools:
Examples
from camel_tools.morphology.database import MorphologyDB
# Initialize the default database ('calima-msa-r13')
db = MorphologyDB.builtin_db()
# In the above call, the database is loaded for analysis only by default.
# This is equivalent to writing:
db = MorphologyDB.builtin_db(flags='a')
# We can load it for generation like so:
db = MorphologyDB.builtin_db(flags='g')
# Or for reinflection like so:
db = MorphologyDB.builtin_db(flags='r')
# Since reinflection uses the database in both analysis and generation modes
# internally, the above is equivalent to writing:
db = MorphologyDB.builtin_db(flags='ag')
# We can load other builtin databases by providing the desired database's
# name. The examples above loaded the default database, 'calima-msa-r13'.
# Here we'll load the builtin Egyptian database, 'calima-egy-r13':
db = MorphologyDB.builtin_db('calima-egy-r13')
# Or with flags:
db = MorphologyDB.builtin_db('calima-egy-r13', flags='r')
# We can also initialize external databases:
db = MorphologyDB('/path/to/database')
# or with flags:
db = MorphologyDB('/path/to/database', flags='g')
Footnotes
[1] calima-msa-r13 is a modified version of the almor-msa-r13.db database that ships with MADAMIRA. The calima-msa-r13 database is distributed under the GNU General Public License version 2.
[2] calima-egy-r13 is a modified version of the almor-cra07.db database that ships with MADAMIRA. The calima-egy-r13 database is distributed under the GNU General Public License version 2.
[3] The calima-glf-01 database is distributed under the Creative Commons Attribution 4.0 International License.