I needed detailed New York City zip code information for another script I was writing. The zip codes themselves are easy enough to find online, but I wanted more detail about each zip code's location. So I wrote a Perl script that merges two hard-coded Perl data structures and prints the result out as a very basic JSON database file.
When creating Perl scripts with command line options, my go-to CPAN module is Getopt::Long. However, for this script I will use MooX::Options, as I may extract some of the methods for use in a future Moo module.
The script has three options: ‘create_zip_db’, ‘read_zip_db’ and ‘verbose’. The ‘doc’ attribute gives a brief description of each option. The ‘short’ attribute specifies any aliases that can be used for each option. Setting ‘is’ to ‘ro’ makes the option value read-only once the object has been built.
option create_zip_db => (
    is    => 'ro',
    short => 'new_zipdb|new_zip',
    doc   => q/Create a new NYC Zip, Borough, District, Town JSON file./,
);
option read_zip_db => (
    is    => 'ro',
    short => 'read_db',
    doc   => q/Read the NYC Zip file database./,
);
option verbose => ( is => 'ro', doc => 'Print details' );
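For comparison, here is a rough sketch of how the same three flags might look with Getopt::Long, the module mentioned above as my usual choice. This is not part of the script. The function name `parse_zipdb_options` is made up for illustration, and the alias spellings simply mirror the ‘short’ attributes.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse the three flags from a list of arguments.
sub parse_zipdb_options {
    my @args = @_;
    my %opt  = ( create_zip_db => 0, read_zip_db => 0, verbose => 0 );
    GetOptionsFromArray(
        \@args,
        'create_zip_db|new_zipdb|new_zip' => \$opt{create_zip_db},
        'read_zip_db|read_db'             => \$opt{read_zip_db},
        'verbose'                         => \$opt{verbose},
    ) or die "Bad command line options\n";
    return \%opt;
}

my $opt = parse_zipdb_options( '--create_zip_db', '--verbose' );
print "create_zip_db=$opt->{create_zip_db} verbose=$opt->{verbose}\n";
```

With Getopt::Long the help text has to be written separately, whereas MooX::Options generates it from the ‘doc’ attributes.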
There are three Moo attributes. Some time in the future I can put these into a separate Moo module.
has db_dir => (
    is      => 'rw',
    isa     => Path,
    coerce  => 1,
    default => sub { "$Bin/../db" }
);
has zip_db_json_file => (
    is      => 'lazy',
    isa     => Path,
    builder => sub {
        $_[0]->db_dir->child("zip_db.json");
    }
);
has zip_hash => (
    is  => 'lazy',
    isa => sub {
        die "'zip_hash' must be a HASH" unless ( ref( $_[0] ) eq 'HASH' );
    },
    builder => sub {
        deserialize_file $_[0]->zip_db_json_file;
    }
);
The first attribute, ‘db_dir’, specifies the future location of the JSON file. It uses Types::Path::Tiny to enforce this directory path as a Path::Tiny object; with ‘coerce’ enabled, the plain string default is converted into one. The ‘zip_db_json_file’ attribute is also a Types::Path::Tiny Path.
The ‘zip_hash’ attribute is the data structure that will store the NYC zip code, borough, district and town information. The ‘isa’ for this attribute ensures that it is a Perl hash reference. The ‘deserialize_file’ function comes from the CPAN module File::Serialize, which is very useful for dumping out Perl data structures to a JSON file or, in this case, slurping a JSON file into a Perl data structure. It also handles formats other than JSON.
Note that the ‘zip_hash’ attribute is ‘lazy’. I’m not saying that zip codes are particularly averse to work. This is just Moo’s way of saying, “please don’t make me do anything until I really have to”. That way, resources are not used unnecessarily to create a structure that isn’t being called for.
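To make the lazy idea concrete, here is a minimal hand-rolled sketch in plain Perl, without Moo. The `LazyDemo` package, the build counter and the stand-in builder are all hypothetical; Moo generates its accessors differently, but the build-on-first-read behaviour is the same idea.

```perl
use strict;
use warnings;

package LazyDemo;    # hypothetical class, for illustration only

my $build_count = 0;    # counts how often the builder actually runs

sub new { return bless {}, shift }

# Lazy accessor: build on first access, then reuse the cached value.
sub zip_hash {
    my $self = shift;
    $self->{zip_hash} //= $self->_build_zip_hash;
    return $self->{zip_hash};
}

# Stand-in builder; the real script reads the JSON file at this point.
sub _build_zip_hash {
    $build_count++;
    return { '10023' => { borough => 'Manhattan' } };
}

sub build_count { return $build_count }

package main;

my $demo = LazyDemo->new;
print "Builds after new: ", LazyDemo->build_count, "\n";
$demo->zip_hash for 1 .. 3;
print "Builds after three reads: ", LazyDemo->build_count, "\n";
```

Nothing is built when the object is constructed, and repeated reads reuse the first result.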
# Main
sub run {
    my ($self) = @_;
    $self->create_new_zipdb_file if $self->create_zip_db;
    $self->read_and_dump_the_db  if $self->read_zip_db;
    say "All Done!" if $self->verbose;
}
main->new_with_options()->run;
MooX::Options has its own particular style for creating a “Main” function, one that you won’t usually see in standard Perl scripts. It may be borrowed from brian d foy’s “Modulino” concept. Anyway, the script is invoked by:
main->new_with_options()->run;
The main ‘run’ function will call the methods as specified by the command line options.
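For readers who have not met the modulino idea, here is a minimal sketch of it in plain Perl. The package name `ZipDbApp` and its hash-based constructor are hypothetical stand-ins. The `caller` check is the modulino part: it lets the same file work both as a script and as a loadable module. My script calls ‘run’ unconditionally, so it only borrows the object-with-a-run-method style.

```perl
use strict;
use warnings;

package ZipDbApp;    # hypothetical package name

sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}

# Dispatch on the options and report which actions were taken.
sub run {
    my $self = shift;
    my @done;
    push @done, 'create' if $self->{create_zip_db};
    push @done, 'read'   if $self->{read_zip_db};
    return @done;
}

# Run only when executed as a script, not when loaded with require/use.
unless ( caller() ) {
    my @done = ZipDbApp->new( read_zip_db => 1 )->run;
    print "Ran: @done\n";
}

1;
```

Because ‘run’ is an ordinary method, a test file can load the package and call it directly with whatever options it likes.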
To run this script from the command line:
# To get help
λ perl bin\create_zipdb.pl -h
USAGE: create_zipdb.pl [-h] [long options ...]
--create_zip_db Create a new NYC Zip, Borough, District, Town JSON
file.
--read_zip_db Read the NYC Zip file database.
--verbose Print details
--usage show a short help message
-h show a compact help message
--help show a long help message
--man show the manual
# Create a JSON file database
λ perl bin\create_zipdb.pl --create_zip_db --v
# Read the database and dump to the terminal
λ perl bin\create_zipdb.pl --read_zip_db
Most of the actual work of reading in the hard coded data structure and creating/reading the JSON database file is done here:
sub create_new_zipdb_file {
    my $self          = shift;
    my $zip_boro_dist = $self->get_raw_zip_data();
    serialize_file $self->zip_db_json_file => $zip_boro_dist;
    say "Created a new " . $self->zip_db_json_file if $self->verbose;
}
sub get_raw_zip_data {
    my $self         = shift;
    my %zips_to_city = %{ _get_zips_to_city() };
    my %bdz          = %{ _get_borough_district_zips() };
    my %zip_boro_dist;
    for my $borough ( sort keys %bdz ) {
        my %district = %{ $bdz{$borough} };
        for my $district_name ( sort keys %district ) {
            my @district_zips = @{ $district{$district_name} };
            for my $zip ( sort @district_zips ) {
                my ( $city, $county ) = split /,/, $zips_to_city{$zip};
                $county =
                    $borough eq 'Brooklyn' ? 'Kings'
                  : $borough eq 'Bronx'    ? 'Bronx'
                  :                          'New York'
                  unless $county;
                $zip_boro_dist{$zip} = {
                    borough  => $borough,
                    district => $district_name,
                    city     => $city,
                    county   => $county,
                };
            }
        }
    }
    return \%zip_boro_dist;
}
sub read_and_dump_the_db {
    my $self         = shift;
    my $location_rec = $self->zip_hash;
    dump $location_rec;
}
Method ‘get_raw_zip_data’ grabs the two hard-coded data structures and merges them, filling in a default county where the source data does not supply one. It is called by ‘create_new_zipdb_file’, which uses the ‘serialize_file’ function from File::Serialize to dump the Perl data structure in JSON format to the output JSON file.
Method ‘read_and_dump_the_db’ just reads this JSON file into the ‘zip_hash’ and dumps the contents to the console. Here is an excerpt from the generated JSON file:
"10022" : {
"borough" : "Manhattan",
"city" : "New York",
"county" : "New York",
"district" : "Gramercy Park and Murray Hill"
},
"10023" : {
"borough" : "Manhattan",
"city" : "New York",
"county" : "New York",
"district" : "Upper West Side"
},
...
"10314" : {
"borough" : "Staten Island",
"city" : "Staten Island",
"county" : "Richmond",
"district" : "Mid-Island"
},
"10451" : {
"borough" : "Bronx",
"city" : "Bronx",
"county" : "Bronx",
"district" : "High Bridge and Morrisania"
},
...
"11426" : {
"borough" : "Queens",
"city" : "Bellerose",
"county" : "Queens",
"district" : "Southeast Queens"
},
"11427" : {
"borough" : "Queens",
"city" : "Queens Village",
"county" : "Queens",
"district" : "Southeast Queens"
},
"11428" : {
"borough" : "Queens",
"city" : "Queens Village",
"county" : "Queens",
"district" : "Southeast Queens"
},
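The merge-and-serialize step can also be tried without the CPAN dependencies. The sketch below uses the core JSON::PP module instead of File::Serialize and two tiny stand-in structures. The shape of the stand-in data (a "city,county" string per zip, and borough => district => list of zips) is an assumption inferred from the code above, the function name `merge_zip_data` is made up, and the county fallback is simplified.

```perl
use strict;
use warnings;
use JSON::PP;

# Tiny stand-ins for the two hard-coded structures (shapes are assumed).
my %zips_to_city = (
    '10023' => 'New York',
    '11426' => 'Bellerose,Queens',
);
my %bdz = (
    Manhattan => { 'Upper West Side'  => ['10023'] },
    Queens    => { 'Southeast Queens' => ['11426'] },
);

# Hypothetical helper: walk borough => district => zips, build one record per zip.
sub merge_zip_data {
    my ( $zips_to_city, $bdz ) = @_;
    my %merged;
    for my $borough ( sort keys %$bdz ) {
        for my $district ( sort keys %{ $bdz->{$borough} } ) {
            for my $zip ( sort @{ $bdz->{$borough}{$district} } ) {
                my ( $city, $county ) = split /,/, $zips_to_city->{$zip};
                $county ||= 'New York';    # simplified fallback for this sketch
                $merged{$zip} = {
                    borough  => $borough,
                    district => $district,
                    city     => $city,
                    county   => $county,
                };
            }
        }
    }
    return \%merged;
}

my $merged = merge_zip_data( \%zips_to_city, \%bdz );

# canonical sorts the keys; pretty adds indentation.
my $json = JSON::PP->new->canonical->pretty->encode($merged);
print $json;

# Decoding the text gives back an equivalent Perl hash.
my $round_trip = JSON::PP->new->decode($json);
print "11426 is in $round_trip->{'11426'}{borough}\n";
```

Sorting the keys on output keeps the file stable between runs, which makes changes to the data easy to spot in a diff.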
The complete script can be found here: create_zipdb.pl