Question: Homework 10

CISP 430 Assignment 10, Spring 2018
Part 1 Implementation: Hashing with Collision Resolution

Write a program to read protein sequences from a file, count them, and allow for retrieval of any single protein sequence.

Read in proteins and store them in a hash table. You do not know what the proteins are ahead of time (pretend that the input dataset may change), so you will have to resolve collisions.

The input file is very large, but somehow you happen to know that each protein will be fewer than 30 amino acids long, so you can store each one in a 30-character string. You also know that the file contains many copies of fewer than 20 unique proteins, so you can use a data array with 40 elements, which is twice as much space as you need, to reduce the number of collisions. Each element will contain the key value itself (the protein) and the number of times it occurs in the input file (the count).

Use the following data structure:

    struct arrayelement {
        char protein[30];
        int count;
    };
    arrayelement proteins[40];

The hash function is:

    h(key) = ( first-letter-of-key + (2 * last-letter-of-key) ) % 40

where A = 0, B = 1, ..., Z = 25.

Generate output of the form:

    Protein                     Count
    BIKFPLVHANQHVDNSVRWGIKDW    5929
    AWGKKKTKTOFQFPTADANCDCDD    7865
    Etc. for all of them.

    Please enter a sequence: AWGKKKTKTOFQFPTADANCDCDD
    7865 FOUND
    Please enter a sequence: LADYGAGABORNTHISWAY
    NOT FOUND

// The file processing algorithm
While (there are proteins)
    Read in a protein
    Hash the initial index into the proteins table
    While (forever)
        If (found key in table)
            Increment count
            Break;
        If (found empty spot in table)
