I found myself needing detailed New York City zip code information for another script I was writing. The zip codes themselves are easy enough to find online, but I needed more details about each zip code’s location. So I created a Perl script that merges two hard-coded Perl data structures and writes the result out as a very basic JSON database file.
When creating Perl scripts with command line options, my go-to CPAN module is Getopt::Long. However, for this script I will use MooX::Options, as I may extract some of the methods to be used in a future Moo module.
The script has three options: ‘create_zip_db’, ‘read_zip_db’ and ‘verbose’. The ‘doc’ attribute gives a brief description of each option. The ‘short’ attribute specifies any aliases that can be used for each option. Setting ‘is’ to ‘ro’ makes the option value read-only.
    option create_zip_db => (
        is    => 'ro',
        short => 'new_zipdb|new_zip',
        doc   => q/Create a new NYC Zip, Borough, District, Town JSON file./,
    );

    option read_zip_db => (
        is    => 'ro',
        short => 'read_db',
        doc   => q/Read the NYC Zip file database./,
    );

    option verbose => (
        is  => 'ro',
        doc => 'Print details',
    );
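For comparison, here is a rough sketch of how the same three options might look with Getopt::Long, using only core modules. The `parse_args` helper is a hypothetical name, not part of the script, and this version leaves out the help text that MooX::Options generates from ‘doc’.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse a list of arguments into a hash of option values.
sub parse_args {
    my @args = @_;
    my %opt;
    GetOptionsFromArray(
        \@args, \%opt,
        'create_zip_db|new_zipdb|new_zip',    # primary name, then aliases
        'read_zip_db|read_db',
        'verbose',
    ) or die "Bad command line options\n";
    return \%opt;
}

# An alias on the command line is stored under the primary option name.
my $opt = parse_args( '--new_zip', '--verbose' );
print join( ',', sort keys %$opt ), "\n";
```

The aliases are folded back into the primary name, so the rest of the code only ever needs to check ‘create_zip_db’.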
There are three Moo attributes. Some time in the future I can put these into a separate Moo module.
    has db_dir => (
        is      => 'rw',
        isa     => Path,
        coerce  => 1,
        default => sub { "$Bin/../db" },
    );

    has zip_db_json_file => (
        is      => 'lazy',
        isa     => Path,
        builder => sub { $_[0]->db_dir->child("zip_db.json") },
    );

    has zip_hash => (
        is  => 'lazy',
        isa => sub {
            die "'zip_hash' must be a HASH" unless ref( $_[0] ) eq 'HASH';
        },
        builder => sub { deserialize_file $_[0]->zip_db_json_file },
    );
The first attribute, ‘db_dir’, specifies the future location of the JSON file. It uses Types::Path::Tiny to ensure that this directory path is a Path::Tiny object. The ‘zip_db_json_file’ attribute is also a Types::Path::Tiny Path.
The ‘zip_hash’ is the data structure that will store the NYC zip code, borough, district and town information. The ‘isa’ for this attribute ensures that it is a Perl hash reference. The ‘deserialize_file’ function comes from the CPAN module File::Serialize, which is very useful for dumping Perl data structures out to a JSON file or, in this case, slurping a JSON file into a Perl data structure. It also handles formats other than JSON.
Note that the ‘zip_hash’ attribute is ‘lazy’. I’m not saying that zip codes are particularly averse to work. This is just Moo’s way of saying, “please don’t make me do anything until I really have to”. That way, resources are not used unnecessarily to create a structure that isn’t being called for.
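To give a feel for what that round trip involves, here is a minimal sketch that uses the core JSON::PP module as a stand-in for File::Serialize. The helper names `write_json_file` and `read_json_file` are made up for this illustration, and the single record is taken from the sample output further down.

```perl
use strict;
use warnings;
use JSON::PP;
use File::Temp qw(tempdir);
use File::Spec;

my $json = JSON::PP->new->pretty->canonical;

# Hypothetical helper: write a Perl data structure to a file as JSON.
sub write_json_file {
    my ( $file, $data ) = @_;
    open my $fh, '>:encoding(UTF-8)', $file or die "Cannot write $file: $!";
    print {$fh} $json->encode($data);
    close $fh or die "Cannot close $file: $!";
    return;
}

# Hypothetical helper: slurp a JSON file back into a Perl data structure.
sub read_json_file {
    my ($file) = @_;
    open my $fh, '<:encoding(UTF-8)', $file or die "Cannot read $file: $!";
    local $/;    # slurp mode: read the whole file at once
    my $text = <$fh>;
    close $fh;
    return $json->decode($text);
}

my $dir  = tempdir( CLEANUP => 1 );
my $file = File::Spec->catfile( $dir, 'zip_db.json' );

write_json_file( $file,
    { '10023' => { borough => 'Manhattan', district => 'Upper West Side' } } );

my $zip_hash = read_json_file($file);
print "$zip_hash->{'10023'}{district}\n";
```

File::Serialize wraps this kind of boilerplate in a single function call, which is why the attribute builder in the script fits on one line.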
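The idea behind a lazy attribute can be sketched in plain Perl. This is only an illustration of the behaviour, not how Moo implements it, and the `Lazy::Demo` class and its build counter are invented for the example.

```perl
use strict;
use warnings;

package Lazy::Demo;    # hypothetical class, for illustration only

my $build_count = 0;   # counts how often the builder actually runs

sub new { return bless {}, shift }

# Accessor: build the value on first use, then reuse the stored copy.
sub zip_hash {
    my $self = shift;
    $self->{zip_hash} = $self->_build_zip_hash
        unless exists $self->{zip_hash};
    return $self->{zip_hash};
}

# Stand-in for the expensive step (reading the JSON file in the real script).
sub _build_zip_hash {
    $build_count++;
    return { '10023' => { borough => 'Manhattan' } };
}

sub build_count { return $build_count }

package main;

my $obj = Lazy::Demo->new;
print "Builds after new: ", Lazy::Demo->build_count, "\n";
$obj->zip_hash for 1 .. 3;
print "Builds after three reads: ", Lazy::Demo->build_count, "\n";
```

Nothing is built when the object is created, and repeated reads reuse the first result.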
    # Main
    sub run {
        my ($self) = @_;
        $self->create_new_zipdb_file if $self->create_zip_db;
        $self->read_and_dump_the_db  if $self->read_zip_db;
        say "All Done!" if $self->verbose;
    }

    main->new_with_options()->run;

MooX::Options has its own particular style for creating a “Main” function that you won’t usually see in standard Perl scripts. It may be borrowed from brian d foy’s “Modulino” concept. Anyway, the script is started by the final line of the block above, main->new_with_options()->run, which builds the object from the command line options and then calls ‘run’.
The main ‘run’ function will call the methods as specified by the command line options.
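For readers who haven’t met the modulino idea, here is a bare-bones sketch in plain Perl. The package name `My::Modulino` is hypothetical, and this shows the general pattern rather than anything specific to MooX::Options.

```perl
package My::Modulino;    # hypothetical name, for illustration only
use strict;
use warnings;

sub new { return bless {}, shift }

# The "main" routine: everything the script does hangs off this method.
sub run {
    my $self = shift;
    print "Running as a script\n";
    return 'ran';
}

# caller() is false when the file is executed directly, and true when it
# is loaded with require/use, so this line only fires for script use.
__PACKAGE__->new->run unless caller;

1;
```

The same file can then be loaded by a test or another program without triggering ‘run’.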
To run this script from the command line:
    # To get help
    λ perl bin\create_zipdb.pl -h
    USAGE: create_zipdb.pl [-h] [long options ...]
        --create_zip_db  Create a new NYC Zip, Borough, District, Town JSON file.
        --read_zip_db    Read the NYC Zip file database.
        --verbose        Print details
        --usage          show a short help message
        -h               show a compact help message
        --help           show a long help message
        --man            show the manual

    # Create a JSON file database
    λ perl bin\create_zipdb.pl --create_zip_db --v

    # Read the database and dump to the terminal
    λ perl bin\create_zipdb.pl --read_zip_db

Most of the actual work of reading in the hard-coded data structures and creating/reading the JSON database file is done here:
    sub create_new_zipdb_file {
        my $self = shift;
        my $zip_boro_dist = $self->get_raw_zip_data();
        serialize_file $self->zip_db_json_file => $zip_boro_dist;
        say "Created a new " . $self->zip_db_json_file if $self->verbose;
    }

    sub get_raw_zip_data {
        my $self = shift;
        my %zips_to_city = %{ _get_zips_to_city() };
        my %bdz          = %{ _get_borough_district_zips() };
        my %zip_boro_dist;

        for my $borough ( sort keys %bdz ) {
            my %district = %{ $bdz{$borough} };
            for my $district_name ( sort keys %district ) {
                my @district_zips = @{ $district{$district_name} };
                for my $zip ( sort @district_zips ) {
                    my ( $city, $county ) = split /,/, $zips_to_city{$zip};
                    $county =
                          $borough eq 'Brooklyn' ? 'Kings'
                        : $borough eq 'Bronx'    ? 'Bronx'
                        :                          'New York'
                        unless $county;
                    $zip_boro_dist{$zip} = {
                        borough  => $borough,
                        district => $district_name,
                        city     => $city,
                        county   => $county,
                    };
                }
            }
        }
        return \%zip_boro_dist;
    }

    sub read_and_dump_the_db {
        my $self = shift;
        my $location_rec = $self->zip_hash;
        dump $location_rec;
    }
Method ‘get_raw_zip_data’ grabs the two hard-coded data structures and merges them, making a few small adjustments along the way, such as filling in a missing county from the borough name. It is called by ‘create_new_zipdb_file’, which uses the ‘serialize_file’ function from File::Serialize to dump the Perl data structure in JSON format to the output JSON file.
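Here is a cut-down, standalone version of that merge with two sample zip codes. The shapes of the two input hashes are my assumption, inferred from how ‘get_raw_zip_data’ reads them; the values are chosen to match the sample output, and JSON::PP stands in for File::Serialize so that only core modules are needed.

```perl
use strict;
use warnings;
use JSON::PP;

# Assumed shape: zip => "city" or "city,county"
my %zips_to_city = (
    '10451' => 'Bronx',
    '11426' => 'Bellerose,Queens',
);

# Assumed shape: borough => { district => [ zips ] }
my %bdz = (
    Bronx  => { 'High Bridge and Morrisania' => ['10451'] },
    Queens => { 'Southeast Queens'           => ['11426'] },
);

my %merged;
for my $borough ( sort keys %bdz ) {
    for my $district ( sort keys %{ $bdz{$borough} } ) {
        for my $zip ( sort @{ $bdz{$borough}{$district} } ) {
            my ( $city, $county ) = split /,/, $zips_to_city{$zip};

            # Fall back to a county derived from the borough when none is given.
            $county ||=
                  $borough eq 'Brooklyn' ? 'Kings'
                : $borough eq 'Bronx'    ? 'Bronx'
                :                          'New York';

            $merged{$zip} = {
                borough  => $borough,
                district => $district,
                city     => $city,
                county   => $county,
            };
        }
    }
}

print JSON::PP->new->pretty->canonical->encode( \%merged );
```

The Bronx entry has no county in its input string, so it picks up the fallback, while the Queens entry keeps the county it was given.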
Method ‘read_and_dump_the_db’ just reads this JSON file into the ‘zip_hash’ and dumps the contents to the console.
    "10022" : {
        "borough" : "Manhattan",
        "city" : "New York",
        "county" : "New York",
        "district" : "Gramercy Park and Murray Hill"
    },
    "10023" : {
        "borough" : "Manhattan",
        "city" : "New York",
        "county" : "New York",
        "district" : "Upper West Side"
    },
    ...
    "10314" : {
        "borough" : "Staten Island",
        "city" : "Staten Island",
        "county" : "Richmond",
        "district" : "Mid-Island"
    },
    "10451" : {
        "borough" : "Bronx",
        "city" : "Bronx",
        "county" : "Bronx",
        "district" : "High Bridge and Morrisania"
    },
    ...
    "11426" : {
        "borough" : "Queens",
        "city" : "Bellerose",
        "county" : "Queens",
        "district" : "Southeast Queens"
    },
    "11427" : {
        "borough" : "Queens",
        "city" : "Queens Village",
        "county" : "Queens",
        "district" : "Southeast Queens"
    },
    "11428" : {
        "borough" : "Queens",
        "city" : "Queens Village",
        "county" : "Queens",
        "district" : "Southeast Queens"
    },

The complete script can be found here: create_zipdb.pl