en_us_normalization.production.verbalize.VerbatimFst
- class en_us_normalization.production.verbalize.VerbatimFst(cardinal: Optional[CardinalFst] = None)
Finite state transducer for verbalizing verbatim tokens, i.e. any text left over after classification into semiotic classes. Verbatim verbalization is a last resort: if input reaches it, the existing semiotic classes likely need to be expanded. Strategy for verbatim verbalization:
sequences of letters are spelled letter by letter (i.e. converted to upper case for pronunciation generation)
digits are pronounced digit by digit
known symbols (“%” or “&”) are converted to their spoken form (“percent” or “ampersand”)
unknown, non-ASCII symbols are dropped (this should be avoided as much as possible)
Example of input/output strings:
verbatim|name:sa12| -> SA one two
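The strategy above can be approximated in plain Python to make the expected behavior concrete. This is only an illustrative sketch, not the transducer itself: the function name `verbalize_verbatim`, the `DIGITS` and `SYMBOLS` tables, and the choice of "zero" for "0" are all assumptions, and the sketch drops every character it does not recognize, which may be stricter than the real grammar.

```python
import re

# Hypothetical lookup tables for illustration; the real grammar's
# digit names and symbol inventory are not shown in this page.
DIGITS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}
SYMBOLS = {"%": "percent", "&": "ampersand"}


def verbalize_verbatim(text: str) -> str:
    """Approximate the verbatim strategy on a raw string (assumed helper)."""
    tokens = []
    # Split into runs of ASCII letters, or single non-letter characters.
    for chunk in re.findall(r"[A-Za-z]+|.", text):
        if chunk.isascii() and chunk.isalpha():
            # Letter sequences are upper-cased so they get spelled out.
            tokens.append(chunk.upper())
        elif chunk in DIGITS:
            # Digits are pronounced one at a time.
            tokens.append(DIGITS[chunk])
        elif chunk in SYMBOLS:
            # Known symbols are mapped to their spoken form.
            tokens.append(SYMBOLS[chunk])
        # Anything else (e.g. non-ASCII symbols) is dropped.
    return " ".join(tokens)


print(verbalize_verbatim("sa12"))  # SA one two
```

The sketch reproduces the documented example (`sa12` becomes `SA one two`); how the actual FST handles whitespace or unrecognized ASCII punctuation is not specified here.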
- __init__(cardinal: Optional[CardinalFst] = None)
Constructor of the verbatim verbalizer.
- Parameters
- cardinal: CardinalFst
Verbalizer of cardinal numbers to reuse for number expansion.