en_us_normalization.production.classify.ShorteningFst
- class en_us_normalization.production.classify.ShorteningFst
Finite-state transducer that detects shortenings, such as Mrs. or prof. All shortenings and their mappings are stored in:
shortenings/case_agnostic.tsv - shortenings that are expanded regardless of letter case
shortenings/cased.tsv - shortenings that are expanded only when written exactly as in the data file
Shortenings are expanded immediately, so there is no need for a separate verbalization step or a dedicated semiotic class.
Examples of input/output strings:
mrs. -> name: “misses”
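
The lookup behaviour described above can be sketched in plain Python. This is only an illustration of the case-agnostic versus cased distinction, not the class's actual FST implementation. It assumes a two-column, tab-separated layout (shortening, then expansion) for the data files, and every entry other than mrs. is hypothetical, as are the names load_table and classify_shortening.

```python
import csv
import io

# Hypothetical file contents: the real TSV files ship with the package
# and their exact layout and entries may differ.
CASE_AGNOSTIC_TSV = "mrs.\tmisses\nprof.\tprofessor\n"
CASED_TSV = "St.\tsaint\n"


def load_table(tsv_text):
    """Parse two-column TSV text into a {shortening: expansion} dict."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return {row[0]: row[1] for row in reader if len(row) >= 2}


# Case-agnostic keys are lowercased once so lookups can ignore case.
CASE_AGNOSTIC = {k.lower(): v for k, v in load_table(CASE_AGNOSTIC_TSV).items()}
# Cased keys are kept verbatim and must match the input exactly.
CASED = load_table(CASED_TSV)


def classify_shortening(token):
    """Return a 'name: "..."' string for a known shortening, else None."""
    expansion = CASED.get(token)
    if expansion is None:
        expansion = CASE_AGNOSTIC.get(token.lower())
    if expansion is None:
        return None
    # The expansion is emitted directly, so no later verbalization is needed.
    return f'name: "{expansion}"'


print(classify_shortening("mrs."))
print(classify_shortening("MRS."))
print(classify_shortening("st."))
```

In this sketch "mrs." and "MRS." both expand, while "st." does not because the hypothetical cased entry is written "St.".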