I found myself needing detailed New York City zip code information for another script I was creating. The zip codes themselves are easy enough to find online, but I needed more details about each zip code's location. So I created a Perl script that merges two hard-coded Perl data structures and prints the result out as a very basic JSON database file.

When creating Perl scripts with command line options, my go-to CPAN module is Getopt::Long. However, for this script I will use MooX::Options, as I may extract some of the methods for use in a future Moo module.

The script has three options: ‘create_zip_db’, ‘read_zip_db’ and ‘verbose’. The ‘doc’ attribute gives a brief description of each option. The ‘short’ attribute specifies any aliases that can be used for each option. Setting ‘is’ to ‘ro’ means that the option value is read-only once the object has been built.

option create_zip_db => (
    is    => 'ro',
    short => 'new_zipdb|new_zip',
    doc   => q/Create a new NYC Zip, Borough, District, Town JSON file./,
);

option read_zip_db => (
    is    => 'ro',
    short => 'read_db',
    doc   => q/Read the NYC Zip file database./,
);

option verbose => ( is => 'ro', doc => 'Print details' );
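For comparison, here is a minimal sketch of how the same three options might look with core Getopt::Long, my usual choice. The helper name parse_zipdb_options is hypothetical and is not part of the actual script. I am also assuming that the ‘short’ aliases map onto Getopt::Long's `name|alias` syntax.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse the three flags from a list of arguments.
# The aliases after each '|' mirror the 'short' values used above.
sub parse_zipdb_options {
    my @args = @_;
    my %opt  = ( create_zip_db => 0, read_zip_db => 0, verbose => 0 );
    GetOptionsFromArray(
        \@args,
        'create_zip_db|new_zipdb|new_zip' => \$opt{create_zip_db},
        'read_zip_db|read_db'             => \$opt{read_zip_db},
        'verbose'                         => \$opt{verbose},
    ) or die "Bad command line options\n";
    return \%opt;
}

# Parse a sample argument list and show which flags were set.
my $opt = parse_zipdb_options( '--new_zip', '--verbose' );
print join( ', ', map { "$_=$opt->{$_}" } sort keys %$opt ), "\n";
```

What this sketch does not give you is the generated help text or the Moo object that MooX::Options builds for free.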

There are three Moo attributes. At some point in the future I can move these into a separate Moo module.

has db_dir => (
    is      => 'rw',
    isa     => Path,
    coerce  => 1,
    default => sub { "$Bin/../db" }
);

has zip_db_json_file => (
    is      => 'lazy',
    isa     => Path,
    builder => sub {
        $_[0]->db_dir->child("zip_db.json");
    }
);

has zip_hash => (
    is => 'lazy',
    isa =>
      sub { die "'zip_hash' must be a HASH" unless ( ref( $_[0] ) eq 'HASH' ) }
    ,
    builder => sub {
        deserialize_file $_[0]->zip_db_json_file;
    }
);

The first attribute, ‘db_dir’, specifies the future location of the JSON file. It uses Types::Path::Tiny to ensure that this directory path is a Path::Tiny object. The ‘zip_db_json_file’ attribute is also a Types::Path::Tiny Path.

The ‘zip_hash’ attribute is the data structure that will store the NYC zip code, borough, district and town information. The ‘isa’ for this attribute ensures that it is a Perl hash. The ‘deserialize_file’ function comes from the CPAN module File::Serialize, which is very useful for dumping Perl data structures out to a JSON file or, in this case, slurping a JSON file into a Perl data structure. It also handles formats other than JSON.
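To show what that round trip amounts to, here is a minimal sketch using only core modules (JSON::PP and File::Temp). The functions write_json_file and read_json_file are hypothetical stand-ins that I made up for illustration. They are not File::Serialize's API and they only handle JSON.

```perl
use strict;
use warnings;
use JSON::PP;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical stand-in for serialize_file: write a data structure as JSON.
sub write_json_file {
    my ( $file, $data ) = @_;
    my $json = JSON::PP->new->utf8->pretty->canonical;
    open my $fh, '>', $file or die "Cannot write $file: $!";
    print {$fh} $json->encode($data);
    close $fh or die "Cannot close $file: $!";
    return $file;
}

# Hypothetical stand-in for deserialize_file: slurp JSON back into Perl.
sub read_json_file {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot read $file: $!";
    local $/;    # slurp the whole file in one read
    my $text = <$fh>;
    close $fh;
    return JSON::PP->new->utf8->decode($text);
}

# Round trip one sample record through a temporary file.
my $file = File::Spec->catfile( tempdir( CLEANUP => 1 ), 'zip_db.json' );
write_json_file( $file,
    { 10023 => { borough => 'Manhattan', district => 'Upper West Side' } } );
my $zips = read_json_file($file);
print "$zips->{10023}{district}\n";
```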

Note that the ‘zip_hash’ attribute is ‘lazy’. I’m not saying that zip codes are particularly averse to work. This is just Moo’s way of saying, “please don’t make me do anything until I really have to”. That way, resources are not unnecessarily used creating a structure that isn’t being called for.
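The idea behind ‘lazy’ can be sketched by hand in plain Perl. This is only an illustration of the behaviour, not how Moo implements it. The package name Hypothetical::ZipStore and the build counter are made up for the example.

```perl
use strict;
use warnings;

package Hypothetical::ZipStore;

my $build_count = 0;    # counts how often the expensive build runs

sub new { return bless {}, shift }

# Hand-written lazy accessor: build on first read, then reuse the cached value.
sub zip_hash {
    my ($self) = @_;
    $self->{zip_hash} = $self->_build_zip_hash
      unless exists $self->{zip_hash};
    return $self->{zip_hash};
}

# Stands in for the expensive step (reading the JSON file in the real script).
sub _build_zip_hash {
    $build_count++;
    return { 10451 => { borough => 'Bronx' } };    # placeholder data
}

sub build_count { return $build_count }

package main;

my $store = Hypothetical::ZipStore->new;
print "builds after new: ", Hypothetical::ZipStore->build_count, "\n";
$store->zip_hash for 1 .. 3;
print "builds after three reads: ", Hypothetical::ZipStore->build_count, "\n";
```

Creating the object costs nothing, and three reads trigger the builder only once.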

# Main
sub run {
    my ($self) = @_;
    $self->create_new_zipdb_file if $self->create_zip_db;
    $self->read_and_dump_the_db  if $self->read_zip_db;
    say "All Done!"              if $self->verbose;
}
main->new_with_options()->run;

MooX::Options has its own particular style for creating a “Main” function that you won’t usually see in standard Perl scripts. It may be borrowed from brian d foy’s “Modulino” concept. Anyway, the script is invoked by:

main->new_with_options()->run;

The main ‘run’ function will call the methods as specified by the command line options.
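For anyone who hasn't met the modulino idea, here is a minimal sketch of it in plain Perl. The package Hypothetical::ZipTool is made up for illustration. Note that my script calls ‘run’ unconditionally, whereas a modulino usually adds the `unless caller` guard shown here so that the same file can also be loaded as a module.

```perl
use strict;
use warnings;

package Hypothetical::ZipTool;

# Tiny constructor that stores its arguments as object fields.
sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}

# Returns the message instead of printing it, so it is easy to check.
sub run {
    my ($self) = @_;
    return $self->{verbose} ? "All Done!" : "";
}

package main;

# caller() returns nothing when the file is executed directly and a call
# frame when it is loaded with require/use, so this only fires on a direct run.
print Hypothetical::ZipTool->new( verbose => 1 )->run, "\n" unless caller;
```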

To run this script from the command line:

# To get help
λ perl bin\create_zipdb.pl -h
USAGE: create_zipdb.pl [-h] [long options ...]

    --create_zip_db  Create a new NYC Zip, Borough, District, Town JSON
                     file.
    --read_zip_db    Read the NYC Zip file database.
    --verbose        Print details

    --usage          show a short help message
    -h               show a compact help message
    --help           show a long help message
    --man            show the manual

# Create a JSON file database
λ perl bin\create_zipdb.pl --create_zip_db --v

# Read the database and dump to the terminal
λ perl bin\create_zipdb.pl --read_zip_db

Most of the actual work of reading in the hard coded data structure and creating/reading the JSON database file is done here:

sub create_new_zipdb_file {
    my $self          = shift;
    my $zip_boro_dist = $self->get_raw_zip_data();
    serialize_file $self->zip_db_json_file => $zip_boro_dist;
    say "Created a new " . $self->zip_db_json_file if $self->verbose;
}

sub get_raw_zip_data {
    my $self         = shift;
    my %zips_to_city = %{ _get_zips_to_city() };
    my %bdz          = %{ _get_borough_district_zips() };
    my %zip_boro_dist;
    for my $borough ( sort keys %bdz ) {
        my %district = %{ $bdz{$borough} };
        for my $district_name ( sort keys %district ) {
            my @district_zips = @{ $district{$district_name} };
            for my $zip ( sort @district_zips ) {
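                # Split the stored value into a city and an optional county.
                my ( $city, $county ) = split /,/, $zips_to_city{$zip};

                # No county listed: fall back to one based on the borough.
                $county =
                    $borough eq 'Brooklyn' ? 'Kings'
                  : $borough eq 'Bronx'    ? 'Bronx'
                  : 'New York'
                  unless $county;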

                $zip_boro_dist{$zip} = {
                    borough  => $borough,
                    district => $district_name,
                    city     => $city,
                    county   => $county,
                };
            }
        }
    }
    return \%zip_boro_dist;
}

sub read_and_dump_the_db {
    my $self         = shift;
    my $location_rec = $self->zip_hash;
    dump $location_rec;
}

Method ‘get_raw_zip_data’ grabs the two hard-coded data structures and merges them, making a few little adjustments along the way. It is called by ‘create_new_zipdb_file’, which uses the ‘serialize_file’ function from File::Serialize to dump the Perl data structure in JSON format to the output JSON file.
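Here is a small worked example of that merge, cut down to three zip codes. The input hashes are hypothetical miniatures that I made up to match the shapes the loop expects (a "city" or "city,county" string per zip, and borough => district => list of zips). They are not the real data from the script, and merge_zip_data is just an illustrative name.

```perl
use strict;
use warnings;

# Hypothetical miniature inputs, shaped like what the merge loop expects.
my %zips_to_city = (
    '10023' => 'New York',            # no county given
    '10451' => 'Bronx',               # no county given
    '11426' => 'Bellerose,Queens',    # "city,county"
);
my %bdz = (
    Manhattan => { 'Upper West Side'            => ['10023'] },
    Bronx     => { 'High Bridge and Morrisania' => ['10451'] },
    Queens    => { 'Southeast Queens'           => ['11426'] },
);

sub merge_zip_data {
    my ( $city_for, $districts_for ) = @_;
    my %merged;
    for my $borough ( sort keys %$districts_for ) {
        for my $district ( sort keys %{ $districts_for->{$borough} } ) {
            for my $zip ( sort @{ $districts_for->{$borough}{$district} } ) {
                my ( $city, $county ) = split /,/, $city_for->{$zip};

                # Same fallback as the script when no county is listed.
                $county ||=
                    $borough eq 'Brooklyn' ? 'Kings'
                  : $borough eq 'Bronx'    ? 'Bronx'
                  :                          'New York';
                $merged{$zip} = {
                    borough  => $borough,
                    district => $district,
                    city     => $city,
                    county   => $county,
                };
            }
        }
    }
    return \%merged;
}

my $merged = merge_zip_data( \%zips_to_city, \%bdz );
printf "%s: %s, %s County\n", $_, $merged->{$_}{city}, $merged->{$_}{county}
  for sort keys %$merged;
```

With this sample data, 10023 falls back to New York County, 10451 to Bronx County, and 11426 keeps the Queens county that was listed alongside its city.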

Method ‘read_and_dump_the_db’ just reads this JSON file into the ‘zip_hash’ and dumps the contents to the console.

   "10022" : {
      "borough" : "Manhattan",
      "city" : "New York",
      "county" : "New York",
      "district" : "Gramercy Park and Murray Hill"
   },
   "10023" : {
      "borough" : "Manhattan",
      "city" : "New York",
      "county" : "New York",
      "district" : "Upper West Side"
   },
   ...
   "10314" : {
      "borough" : "Staten Island",
      "city" : "Staten Island",
      "county" : "Richmond",
      "district" : "Mid-Island"
   },
   "10451" : {
      "borough" : "Bronx",
      "city" : "Bronx",
      "county" : "Bronx",
      "district" : "High Bridge and Morrisania"
   },
   ...
   "11426" : {
      "borough" : "Queens",
      "city" : "Bellerose",
      "county" : "Queens",
      "district" : "Southeast Queens"
   },
   "11427" : {
      "borough" : "Queens",
      "city" : "Queens Village",
      "county" : "Queens",
      "district" : "Southeast Queens"
   },
   "11428" : {
      "borough" : "Queens",
      "city" : "Queens Village",
      "county" : "Queens",
      "district" : "Southeast Queens"
   },

The complete script can be found here: create_zipdb.pl
