en_us_normalization.production.classify.ShorteningFst

class en_us_normalization.production.classify.ShorteningFst

Finite-state transducer that detects shortenings, such as Mrs. or prof. All shortenings and their expansions are stored in two data files:

  • shortenings/case_agnostic.tsv - shortenings that are expanded regardless of letter case

  • shortenings/cased.tsv - shortenings that are expanded only when written exactly as they appear in the data file

Shortenings are expanded immediately during classification, so they need neither a separate verbalization step nor a dedicated semiotic class.

Examples of input/output strings:

  • mrs. -> name: "misses"
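
The difference between the two data files can be illustrated with a minimal sketch in plain Python. It emulates the lookup with dictionaries and does not build an actual transducer. The function name `classify_shortening` and every table entry except mrs. -> misses are hypothetical and chosen only for illustration; the real contents of the TSV files are not shown here.

```python
from typing import Optional

# Hypothetical stand-ins for shortenings/case_agnostic.tsv and
# shortenings/cased.tsv. Only "mrs." -> "misses" comes from the
# documentation; the other entries are assumptions.
CASE_AGNOSTIC = {"mrs.": "misses", "prof.": "professor"}
CASED = {"St.": "saint"}


def classify_shortening(token: str) -> Optional[str]:
    """Return the serialized expansion of a known shortening, else None."""
    # Cased entries must match the spelling in the table exactly.
    expansion = CASED.get(token)
    if expansion is None:
        # Case-agnostic entries match regardless of letter case.
        expansion = CASE_AGNOSTIC.get(token.lower())
    if expansion is None:
        return None
    # The expansion is emitted directly as a plain name token,
    # so no later verbalization step is required.
    return f'name: "{expansion}"'


if __name__ == "__main__":
    for text in ["mrs.", "MRS.", "St.", "st."]:
        print(text, "->", classify_shortening(text))
```

In this sketch, "MRS." is expanded because its lowercased form is in the case-agnostic table, while "st." is left alone because only the exact spelling "St." appears in the cased table.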

__init__()