en_us_normalization.production.classify.AddressFst
- class en_us_normalization.production.classify.AddressFst
Finite state transducer for classifying addresses. An address consists of multiple slots, most of which are optional. The slots are:
house number - mandatory
street - consists of a name, a type (road, street, square, etc.) and a pre- or post-directional (e.g. N for north) - mandatory
suite - apartment or house number, consists of a type and a number (e.g. Apt #23) - optional
town - town name, possibly multi-word (e.g. San-Francisco) - optional
state - state name, usually abbreviated (e.g. CA) - optional
zip-code - 5-digit number with an optional dash-separated 4-digit extension (e.g. 45149-3214). A British-format postcode such as "SW1W 0NY" is also accepted; it consists of an outcode and an incode separated by a space - optional
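The two zip-code shapes described above can be approximated with regular expressions. The following is a minimal sketch using only Python's standard `re` module. It is not the grammar that `AddressFst` builds, and the names `US_ZIP`, `UK_POSTCODE` and `classify_zip` are hypothetical, introduced here for illustration. The British pattern is a deliberate simplification and does not cover every valid postcode.

```python
import re

# Assumption: simplified patterns for illustration only, not the FST's grammar.
# US form: 5 digits, optionally followed by a dash and a 4-digit extension.
US_ZIP = re.compile(r"(\d{5})(?:-(\d{4}))?")
# British form: outcode and incode separated by a single space.
UK_POSTCODE = re.compile(r"([A-Z]{1,2}\d[A-Z\d]?) (\d[A-Z]{2})")


def classify_zip(text):
    """Return (kind, parts) if text matches one of the two shapes, else None."""
    match = US_ZIP.fullmatch(text)
    if match:
        # Drop the extension group when it is absent.
        return ("us", tuple(p for p in match.groups() if p is not None))
    match = UK_POSTCODE.fullmatch(text)
    if match:
        return ("uk", match.groups())
    return None


print(classify_zip("45149-3214"))
print(classify_zip("SW1W 0NY"))
```

For the US form the parts are the 5-digit code and, when present, the extension; for the British form they are the outcode and the incode.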
Examples of addresses and their parsing:
1599 Curabitur Rd. Bandera South Dakota 45149 -> address { house: "1599" street_name: "Curabitur" street_type: "road" town: "Bandera" state: "South Dakota" zip: "45149"}
123 N Malanyuka St. SE, Apt #23 San-Francisco CA 45149-3214 -> address { house: "123" pre_directional: "north" street_name: "Malanyuka" street_type: "street" post_directional: "south east" suite_type: "apartment" suite_number: "23" town: "San Francisco" state: "california" zip: "451493214"}
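To make the slot layout in these examples concrete, here is a small sketch that serializes a dictionary of slots into the same `address { ... }` shape. It does not call the library: `SLOT_ORDER`, `MANDATORY` and `format_address` are hypothetical names, the slot order is copied from the examples above, and treating `house` and `street_name` as the mandatory fields is an assumption based on the slot list.

```python
# Assumption: slot names and their order are taken from the examples above.
SLOT_ORDER = (
    "house", "pre_directional", "street_name", "street_type",
    "post_directional", "suite_type", "suite_number",
    "town", "state", "zip",
)
# Assumption: only these two fields are required.
MANDATORY = ("house", "street_name")


def format_address(slots):
    """Serialize a dict of address slots, skipping the empty optional ones."""
    missing = [name for name in MANDATORY if not slots.get(name)]
    if missing:
        raise ValueError(f"missing mandatory slots: {missing}")
    unknown = sorted(set(slots) - set(SLOT_ORDER))
    if unknown:
        raise ValueError(f"unknown slots: {unknown}")
    body = " ".join(
        f'{name}: "{slots[name]}"' for name in SLOT_ORDER if slots.get(name)
    )
    return "address { " + body + "}"


print(format_address({
    "house": "1599", "street_name": "Curabitur", "street_type": "road",
    "town": "Bandera", "state": "South Dakota", "zip": "45149",
}))
```

Optional slots that are absent simply do not appear in the output, which mirrors how the first example omits the suite and directional fields.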