en_us_normalization.production.classify.ShorteningFst

class en_us_normalization.production.classify.ShorteningFst

Finite-state transducer that detects shortenings, such as Mrs. or prof. All shortenings and their mappings are stored in:

shortenings/case_agnostic.tsv - shortenings that are expanded regardless of case
shortenings/cased.tsv - shortenings that are expanded only when written exactly as in the data file

Shortenings are expanded immediately, so neither a separate verbalization step nor a dedicated semiotic class is needed.
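
The difference between the two data files can be illustrated with a minimal plain-Python sketch. It is not the transducer itself. It assumes each file is a two-column, tab-separated table (shortening, expansion), and the table contents and the names load_table and expand_shortening are hypothetical, chosen only for illustration.

```python
import csv
import io
from typing import Dict, Optional

# Hypothetical table contents; the real TSV files may hold different entries.
CASE_AGNOSTIC_TSV = "prof.\tprofessor\n"
CASED_TSV = "Mrs.\tmisses\n"


def load_table(text: str) -> Dict[str, str]:
    """Parse two-column TSV text into a {shortening: expansion} mapping."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return {row[0]: row[1] for row in reader if len(row) >= 2}


# Case-agnostic keys are lowercased once so lookups can ignore case.
CASE_AGNOSTIC = {k.lower(): v for k, v in load_table(CASE_AGNOSTIC_TSV).items()}
# Cased keys are kept exactly as written in the data.
CASED = load_table(CASED_TSV)


def expand_shortening(token: str) -> Optional[str]:
    """Return the expansion for a token, or None if it is not a known shortening."""
    if token in CASED:
        # Cased entries match only the exact spelling from the table.
        return CASED[token]
    # Case-agnostic entries match any capitalization.
    return CASE_AGNOSTIC.get(token.lower())
```

With these assumed tables, expand_shortening("PROF.") and expand_shortening("prof.") both yield "professor", while "Mrs." expands but "mrs." does not, because cased entries must match the spelling in the table exactly. The expansion is returned directly, mirroring the absence of a separate verbalization step.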

Examples of input/output strings: