en_us_normalization.production.classify.DecimalFst

class en_us_normalization.production.classify.DecimalFst(cardinal: Optional[CardinalFst] = None)[source]

Finite state transducer for classifying decimals, i.e. numbers with a fractional part. The FST accepts three forms:

  • both the integer and the fractional part are present, e.g. “12.5006”

  • only the fractional part is present, e.g. “.35”

  • only the integer part is present, e.g. “12”. The cardinal semiotic class can also handle this form, but it is kept in decimal because a decimal can be part of a composite semiotic class, such as measure

The integer part of a decimal can be any cardinal, or a single “0” for cases such as “0.5”. The fractional part can be any sequence of digits after the dot.
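The three accepted forms can be illustrated with a small pure-Python sketch. It is not the library's implementation: it uses a regular expression instead of pynini, the names DECIMAL_SHAPE and split_decimal are hypothetical, and it assumes the integer part is a plain run of ASCII digits, which may differ from what the cardinal grammar actually accepts.

```python
import re

# Assumption: the integer part is modeled as plain ASCII digits. The real
# cardinal grammar used by the library may accept a different set of strings.
DECIMAL_SHAPE = re.compile(r"(?P<integer>[0-9]+)?(?:\.(?P<fractional>[0-9]+))?")


def split_decimal(text):
    """Return (integer_part, fractional_part), or None if text is rejected.

    A part that is absent from the input is returned as None.
    """
    match = DECIMAL_SHAPE.fullmatch(text)
    # Both groups are optional, so the empty string matches; reject it.
    if match is None or not match.group(0):
        return None
    return match.group("integer"), match.group("fractional")


print(split_decimal("12.5006"))  # integer and fractional part
print(split_decimal(".35"))      # fractional part only
print(split_decimal("12"))       # integer part only
print(split_decimal("12."))      # a dot must be followed by digits
```

In this sketch a trailing dot with no digits is rejected, since the fractional part is described as a sequence of digits after the dot.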

A decimal can optionally be followed by a quantity, in either full form (e.g. “12 thousands”) or short form (e.g. “12k”). Supported quantities are stored in data/magnitudes.tsv.

Examples for decimals and their tagging:

  • -12.5006 -> decimal { negative: “true” integer_part: “12” fractional_part: “5006” }

  • 13k -> decimal { integer_part: “13” quantity: “thousands” }
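The tagging shown in these examples can be mimicked with a short pure-Python sketch. This is an illustration under stated assumptions, not the pynini grammar itself: tag_decimal, PATTERN and MAGNITUDES are hypothetical names, the magnitude table holds only the entries implied by the examples (the real list lives in data/magnitudes.tsv), and the output uses plain ASCII quotes.

```python
import re

# Hypothetical magnitude table. The real entries are stored in
# data/magnitudes.tsv and are not reproduced here.
MAGNITUDES = {"k": "thousands", "thousands": "thousands"}

PATTERN = re.compile(
    r"(?P<negative>-)?"                  # optional minus sign
    r"(?P<integer>[0-9]+)?"              # optional integer part (assumed plain digits)
    r"(?:\.(?P<fractional>[0-9]+))?"     # optional fractional part after the dot
    r"(?:\s?(?P<quantity>[A-Za-z]+))?"   # optional quantity, full or short form
)


def tag_decimal(text):
    """Return the tagged string for text, or None if it is not a decimal."""
    match = PATTERN.fullmatch(text)
    if match is None:
        return None
    integer = match.group("integer")
    fractional = match.group("fractional")
    if integer is None and fractional is None:
        return None  # at least one numeric part is required
    fields = []
    if match.group("negative"):
        fields.append('negative: "true"')
    if integer is not None:
        fields.append(f'integer_part: "{integer}"')
    if fractional is not None:
        fields.append(f'fractional_part: "{fractional}"')
    quantity = match.group("quantity")
    if quantity is not None:
        if quantity not in MAGNITUDES:
            return None  # unknown quantity, reject the whole input
        fields.append(f'quantity: "{MAGNITUDES[quantity]}"')
    return "decimal { " + " ".join(fields) + " }"


print(tag_decimal("-12.5006"))
print(tag_decimal("13k"))
```

Mapping both the short and the full form to the same quantity value is a design choice of this sketch, taken from the “13k” example above.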

TODO: add handling of abbreviated quantities, for ex. .5B -> decimal { fractional_part: “5” quantity: “billion” }

__init__(cardinal: Optional[CardinalFst] = None)[source]

Constructor for the decimal FST.

Parameters
cardinal: CardinalFst

A cardinal FST whose digits FST is reused. If not provided, a new one is initialized from scratch.

static add_quantity(fst: pynini.FstLike, extra_quantity: Optional[pynini.FstLike] = None) -> pynini.FstLike[source]

Helper function that adds an optional quantity field on top of the given graph.

get_basic_decimal_fst()[source]

Getter for the reusable basic decimal digits FST, which transduces “12.56” to integer_part: “12” fractional_part: “56”, i.e. the graph before the decimal tag is added and without a quantity.