Question: Project 1
Project 1
Project 1 shifts our attention to the remaining two ways in which we structure our programs: loops and functions. We are going to read a potentially complex data file and process it line by line for each of our commands. Essentially, we will create a lookup program that searches for data and tells us whether it has been found.
Details
For this program we will use data files from a website, in tab-delimited format. There will be a video in the media on reading tab-delimited data. The data in the video is from the same website, but it is not the same data. You should watch that video if you're not sure how to read the data. The video won't show you exactly how to read this data, but rather gives you the general idea of how to read any tab-delimited data. You'll have to do the work to get our data read in.
Below is the beginning of one of the files we'll be using:
US 99553 Akutan Alaska AK Aleutians East 013 54.143 -165.7854 1
US 99571 Cold Bay Alaska AK Aleutians East 013 55.1858 -162.7211 1
US 99583 False Pass Alaska AK Aleutians East 013 54.841 -163.4368 1
US 99612 King Cove Alaska AK Aleutians East 013 55.0628 -162.3056 1
US 99661 Sand Point Alaska AK Aleutians East 013 55.3192 -160.4914 1
US 99546 Adak Alaska AK Aleutians West (CA) 016 51.88 -176.6581 1
US 99547 Atka Alaska AK Aleutians West (CA) 016 52.1224 -174.4301 1
US 99591 Saint George Island Alaska AK Aleutians West (CA) 016 56.5944 -169.6186 1
US 99638 Nikolski Alaska AK Aleutians West (CA) 016 52.9883 -168.7884 1
US 99660 Saint Paul Island Alaska AK Aleutians West (CA) 016 57.1842 -170.2764 1
So that is the first 10 lines from US.txt. That data represents the zip code information for Alaska, or at least part of it. Here is how the data is represented in the file:
The data format is tab-delimited text in utf8 encoding, with the following fields:
country code : iso country code, 2 characters
postal code : varchar(20)
place name : varchar(180)
admin name1 : 1. order subdivision (state) varchar(100)
admin code1 : 1. order subdivision (state) varchar(20)
admin name2 : 2. order subdivision (county/province) varchar(100)
admin code2 : 2. order subdivision (county/province) varchar(20)
admin name3 : 3. order subdivision (community) varchar(100)
admin code3 : 3. order subdivision (community) varchar(20)
latitude : estimated latitude (wgs84)
longitude : estimated longitude (wgs84)
accuracy : accuracy of lat/lng from 1=estimated to 6=centroid
That is from the readme found on this site, at the bottom of this page: http://download.geonames.org/export/zip/ You may get the original data from that link as well. I won't be using all the data found there, but I did get 6 different files. I would use a string data type for every field except the latitude and longitude, since we need to do math with those. See the commands below for details.
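The parsing step described above could be sketched in C as follows. This is only a sketch under stated assumptions: the `Record` type, the `parse_record` name, and `NUM_FIELDS` are hypothetical, and it assumes each line carries all 12 readme fields separated by single tabs, with blank fields showing up as two tabs in a row. It splits by hand rather than with `strtok`, because `strtok` skips empty fields.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NUM_FIELDS 12   /* country code ... accuracy, per the readme */

/* Hypothetical record type: every field stays text except the two
   coordinates, which are also converted to double for the distance math. */
typedef struct {
    char *field[NUM_FIELDS];   /* pointers into the caller's line buffer */
    double latitude;
    double longitude;
} Record;

/* Splits line in place on tab characters and keeps empty fields.
   Returns 1 on success, 0 if the line has fewer than NUM_FIELDS fields. */
int parse_record(char *line, Record *rec)
{
    int count = 0;
    char *start = line;

    assert(line != NULL && rec != NULL);
    line[strcspn(line, "\r\n")] = '\0';   /* drop the line ending */

    while (count < NUM_FIELDS) {
        char *tab = strchr(start, '\t');
        rec->field[count++] = start;
        if (tab == NULL)
            break;                        /* last field on the line */
        *tab = '\0';                      /* terminate this field */
        start = tab + 1;
    }
    if (count < NUM_FIELDS)
        return 0;

    /* Fields 9 and 10 are latitude and longitude in the readme order. */
    rec->latitude = strtod(rec->field[9], NULL);
    rec->longitude = strtod(rec->field[10], NULL);
    return 1;
}
```

A line would typically be read with `fgets` into a `char` buffer and passed to `parse_record`; the field pointers stay valid only until that buffer is reused for the next line.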
For the data we will have 4 different commands. The basic commands are as follows: place, distance, county, and postal. There is also a command that gives us the filename for the data file.
Commands
filename - when the filename command is given, it will be followed by a filename to use. This command may be issued more than once. If it is issued a second, third, or later time, all subsequent commands will use the new filename for the data file until filename is issued again.
postal - this command will be followed by a postal code. You will use the data file that was given by the filename command and search that file sequentially. If the given postal code is found, then the information about that entry is displayed or printed into the output. If it is not found, then a message to that effect is given. See below for sample output. This is a complete match, meaning the entire postal code must match or it's not a match; e.g. "do" doesn't match "dog" and "cat" doesn't match "kittycat".
county - this command will be followed by a county name, or at least that's what we call it in the US. In the data, the county is admin name2. You will use the data file that was given by the filename command and search that file sequentially. If the given county is found, then the information about that entry is displayed or printed into the output. If it is not found, then a message to that effect is given. See below for sample output. This is a partial match, meaning admin name2 is searched and if the given county appears anywhere in it then it's a match; e.g. "do" does match "dog" and "cat" does match "kittycat", just as "frog" matches "frog". See the string documentation for an idea of how to do this.
place - this command will be followed by a place name. You will use the data file that was given by the filename command and search that file sequentially. If the given place name is found, then the information about that entry is displayed or printed into the output. If it is not found, then a message to that effect is given. See below for sample output. This is a partial match, meaning the place name field is searched and if the given place name appears anywhere in it then it's a match; e.g. "do" does match "dog" and "cat" does match "kittycat", just as "frog" matches "frog". See the string find method for an idea of how to do this.
distance - this command will be followed by two postal codes. You will use the data file that was given by the filename command and search that file sequentially. If you find both of the postal codes, then you will compute the distance between them using their latitude and longitude coordinates and the Haversine Formula. Note: if a given postal code appears more than once, use its first appearance to get the latitude and longitude. See below for sample output. This is a complete match, meaning the entire postal code must match or it's not a match; e.g. "do" doesn't match "dog" and "cat" doesn't match "kittycat".
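The two matching rules used by the commands above map onto two standard C string functions. The sketch below assumes case-sensitive comparison, which the text does not state explicitly; the function names `matches_exact` and `matches_partial` are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Complete match (postal, distance): the whole field must equal the query. */
int matches_exact(const char *field, const char *query)
{
    assert(field != NULL && query != NULL);
    return strcmp(field, query) == 0;
}

/* Partial match (place, county): the query appears somewhere in the field. */
int matches_partial(const char *field, const char *query)
{
    assert(field != NULL && query != NULL);
    return strstr(field, query) != NULL;
}
```

One thing to watch: `strstr` treats an empty query as found in every string, so a command with no argument would match every record unless it is rejected first.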
Info on the Haversine Formula
Sample Input Command File
filename US.txt
postal 24060
postal 00000
place Blacksburg
county Bib
place I Can't Find You!
county Doesn't Exist Co.
distance 24060 24073
filename GB.txt
postal IM7
distance IM7 XYZ
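Each line of a command file like the one above has a command word followed by its argument. A minimal sketch of separating the two in C is given here; the `split_command` name is hypothetical, and it assumes exactly one space between the command and its argument.

```c
#include <assert.h>
#include <string.h>

/* Splits "postal 24060" in place into the command "postal" and the
   argument "24060". Only the first space is used as the separator, so an
   argument such as "I Can't Find You!" keeps its embedded spaces. */
void split_command(char *line, char **command, char **argument)
{
    assert(line != NULL && command != NULL && argument != NULL);

    line[strcspn(line, "\r\n")] = '\0';    /* drop the line ending */
    *command = line;

    char *space = strchr(line, ' ');
    if (space == NULL) {
        *argument = line + strlen(line);   /* no argument: empty string */
    } else {
        *space = '\0';                     /* terminate the command word */
        *argument = space + 1;
    }
}
```

For the distance command the argument still holds both postal codes, so it needs one more split before searching.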
Below will be the corresponding output. Both positive and negative responses will be shown below.
Output
Here's what the output would look like for the given sample input file:
filename US.txt
postal 24060
US 24060 Blacksburg Virginia VA Montgomery 121 37.2563 -80.4347 4
postal 00000
Sorry, postal code 00000 was not found.
place Blacksburg
US 29702 Blacksburg South Carolina SC Cherokee 021 35.1095 -81.4942 4
US 24060 Blacksburg Virginia VA Montgomery 121 37.2563 -80.4347 4
US 24061 Blacksburg Virginia VA Montgomery 121 37.1791 -80.3515 4
US 24062 Blacksburg Virginia VA Montgomery 121 37.2296 -80.4139 4
US 24063 Blacksburg Virginia VA Montgomery 121 37.2296 -80.4139 4
county Bib
US 35034 Brent Alabama AL Bibb 007 32.9357 -87.2114 4
US 35035 Brierfield Alabama AL Bibb 007 33.0427 -86.9517 4
US 35042 Centreville Alabama AL Bibb 007 32.9503 -87.1192 4
US 35074 Green Pond Alabama AL Bibb 007 33.2027 -87.118 4
US 35184 West Blocton Alabama AL Bibb 007 33.1424 -87.1369 4
US 35188 Woodstock Alabama AL Bibb 007 33.2068 -87.15 4
US 36792 Randolph Alabama AL Bibb 007 32.8888 -86.907 4
US 36793 Lawley Alabama AL Bibb 007 32.8646 -86.9567 4
US 31052 Lizella Georgia GA Bibb 021 32.7773 -83.825 4
US 31201 Macon Georgia GA Bibb 021 32.8095 -83.6168 4
US 31202 Macon Georgia GA Bibb 021 32.8407 -83.6324 4
US 31203 Macon Georgia GA Bibb 021 32.8067 -83.6913 4
US 31204 Macon Georgia GA Bibb 021 32.8424 -83.6766 4
US 31205 Macon Georgia GA Bibb 021 32.8067 -83.6913 4
US 31206 Macon Georgia GA Bibb 021 32.7914 -83.679 4
US 31207 Macon Georgia GA Bibb 021 32.8304 -83.6486 4
US 31208 Macon Georgia GA Bibb 021 32.8067 -83.6913 4
US 31209 Macon Georgia GA Bibb 021 32.8067 -83.6913 4
US 31210 Macon Georgia GA Bibb 021 32.8926 -83.7455 4
US 31211 Macon Georgia GA Bibb 021 32.8869 -83.6021 4
US 31213 Macon Georgia GA Bibb 021 32.8393 -83.6388 4
US 31216 Macon Georgia GA Bibb 021 32.7486 -83.7477 4
US 31217 Macon Georgia GA Bibb 021 32.8118 -83.565 4
US 31220 Macon Georgia GA Bibb 021 32.8595 -83.802 4
US 31221 Macon Georgia GA Bibb 021 32.8407 -83.6324 4
US 31294 Macon Georgia GA Bibb 021 32.8407 -83.6324 4
US 31295 Macon Georgia GA Bibb 021 32.8102 -83.569 4
US 31296 Macon Georgia GA Bibb 021 32.8067 -83.6913 4
US 31297 Macon Georgia GA Bibb 021 32.7004 -83.6572 4
place I Can't Find You!
Sorry, place name I Can't Find You! was not found.
county Doesn't Exist Co.
Sorry, county Doesn't Exist Co. was not found.
distance 24060 24073
US 24060 Blacksburg Virginia VA Montgomery 121 37.2563 -80.4347 4
US 24073 Christiansburg Virginia VA Montgomery 121 37.1353 -80.4188 4
Distance: 13.5281 km
filename GB.txt
postal IM7
GB IM7 Port e Vullen Isle of Man IOM 54.2294 -4.2683 3
GB IM7 Ballaugh Isle of Man IOM 54.3119 -4.5446 4
GB IM7 Bride Isle of Man IOM 54.3826 -4.3892 4
GB IM7 Ballasalla Isle of Man IOM 54.0959 -4.6296 4
GB IM7 Andreas Isle of Man IOM 54.3667 -4.4333 4
GB IM7 Cranstal Isle of Man IOM 54.3956 -4.3695 4
GB IM7 Maughold Isle of Man IOM 54.2988 -4.3184 4
GB IM7 Glentruan Isle of Man IOM 54.2294 -4.2683 3
GB IM7 Jurby West Isle of Man IOM 54.2294 -4.2683 3
GB IM7 Smeale Isle of Man IOM 54.2294 -4.2683 3
GB IM7 Dhowin Isle of Man IOM 54.2294 -4.2683 3
GB IM7 The Lhen Isle of Man IOM 54.2294 -4.2683 3
GB IM7 Jurby East Isle of Man IOM 54.2294 -4.2683 3
GB IM7 Regaby Isle of Man IOM 54.2294 -4.2683 3
GB IM7 Dreemskerry Isle of Man IOM 54.2294 -4.2683 3
GB IM7 Dhoon Isle of Man IOM 54.2294 -4.2683 3
GB IM7 St Judes Isle of Man IOM 54.2294 -4.2683 3
GB IM7 Ravensdale Isle of Man IOM 54.2294 -4.2683 3
GB IM7 Corrany Isle of Man IOM 54.2294 -4.2683 3
GB IM7 Sulby Isle of Man IOM 54.3228 -4.4797 4
GB IM7 Churchtown Isle of Man IOM 54.3815 -4.4273 4
GB IM7 Dhoor Isle of Man IOM 54.2294 -4.2683 3
GB IM7 Glen Auldyn Isle of Man IOM 54.2294 -4.2683 3
GB IM7 Ballajora Isle of Man IOM 54.2294 -4.2683 3
GB IM7 Sandygate Isle of Man IOM 54.2294 -4.2683 3
GB IM7 Crawyn Isle of Man IOM 54.2294 -4.2683 3
GB IM7 The Cronk Isle of Man IOM 54.2294 -4.2683 3
GB IM7 Sartfield Isle of Man IOM 54.2294 -4.2683 3
distance IM7 XYZ
GB IM7 Ballajora Isle of Man IOM 54.229400 -4.268300 3
Sorry, postal code XYZ was not found.
You can see that a lot of output can be generated, so it's important that you have the correct formatting, especially for the error messages.
Each command output is started with an echo of the command. That is done for two reasons:
So you know where your program is getting to if it gets stuck.
If this were a program that users were using, they'd know whether the results for finding or not finding were based on the correct input. So echoing the command is good practice when you are receiving input.
After the echo, one of two things will happen.
If the search was successful, i.e. what they were asking for is found, then the record is output. Every field is output, including any blank fields; the easiest way to do this is to make sure you read them all in.
If the search was not successful, then a message indicating that it wasn't found is output. Here's a sample for the postal, place and county commands:
Sorry, postal code 00000 was not found.
Sorry, place name I Can't Find You! was not found.
Sorry, county Doesn't Exist Co. was not found.
You can see they all have the same format.
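Because the three messages share one format, a single helper can produce all of them. The following is a sketch only; the names `label_for` and `format_not_found` are hypothetical, and the caller is assumed to print the resulting line itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Maps a command word to the label used in its not-found message.
   Returns NULL for a command that has no such message. */
const char *label_for(const char *command)
{
    assert(command != NULL);
    if (strcmp(command, "postal") == 0 || strcmp(command, "distance") == 0)
        return "postal code";
    if (strcmp(command, "place") == 0)
        return "place name";
    if (strcmp(command, "county") == 0)
        return "county";
    return NULL;
}

/* Writes "Sorry, <kind> <query> was not found." into out.
   Returns the message length, as reported by snprintf. */
int format_not_found(char *out, size_t size, const char *kind, const char *query)
{
    assert(out != NULL && kind != NULL && query != NULL);
    return snprintf(out, size, "Sorry, %s %s was not found.", kind, query);
}
```

Keeping the wording in one place makes it easier to match the expected output character for character when comparing against the provided sample files.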
Here is a sample for when one of the postal codes isn't found. You can see it at the end of the larger sample above.
distance IM7 XYZ
GB IM7 Port e Vullen Isle of Man IOM 54.2294 -4.2683 3
Sorry, postal code XYZ was not found.
IM7 was found, so we see the record information for it, but XYZ wasn't found, so we see the message about that and no distance is computed. If either or both postal codes are not found, then we would still have an output for each search.
Testing
For testing, think about how you can create input files to make sure your code can both find and not find postal codes, place names, counties (admin name2), and either or both of the postal codes given to distance. Think about how your code would react if no filename was given. What should it do?
We will provide sample input and output files. You should use these for your testing as well. Use the diff command or the file comparison tool within VSCode to compare your results with the expected provided results.
Please use C. Thank you.
