Question: Please use python code. This looks long but it is sime. I have 4 parts project and I have the code for first 2 parts.

Please use python code. This looks long but it is sime. I have 4 parts project and I have the code for first 2 parts. I need help with part 3 only.

Part 1 Bhattacharyya distance

import math def bhatt_dist(D1, D2, n): BCSum = 0 for i in range(n): BCSum += math.sqrt(D1[i] * D2[i]) DBValue = - math.log(BCSum) return DBValue D1 = [1 , 2, 1, 2] D2 = [ 59 ,100, 41, 100] print bhatt_dist(D1, D2, 2)

Output: -3.08297735276

Part 2 lename and creates a single letter frequency distribution

import string

fp=open('sample.txt','r')

#fp=open('file1.txt','r') also used

#fp=open('file2.txt','r') also used

file_name=fp.readlines()

frequency = {}

for l in file_name:

l = filter(lambda x: x in string.letters, l.lower())

for char in l:

if char in frequency:

frequency[char] += 1

else:

frequency[char] = 1

print frequency

Output:

file 1 = {'a': 14819, 'c': 14947, 'b': 14899, 'e': 15106, 'd': 14908, 'g': 14828, 'f': 14854, 'i': 14984, 'h': 14949, 'k': 14725, 'j': 14917, 'm': 15174, 'l': 14821, 'o': 15033, 'n': 14906, 'q': 14849, 'p': 15093, 's': 14771, 'r': 14749, 'u': 14899, 't': 14946, 'w': 14880, 'v': 14709, 'y': 14938, 'x': 14899, 'z': 14808}

file2 = {'a': 8899, 'c': 27723, 'b': 9912, 'e': 16075, 'd': 26786, 'g': 51184, 'f': 9394, 'i': 9639, 'h': 36746, 'k': 6884, 'j': 24261, 'm': 5714, 'l': 25631, 'o': 7135, 'n': 466, 'q': 29605, 'p': 154, 's': 2370, 'r': 16460, 'u': 6589, 't': 24447, 'w': 26625, 'v': 340, 'y': 289, 'x': 10385, 'z': 3698}

sample = {'a': 39604, 'c': 10644, 'b': 7685, 'e': 57726, 'd': 22713, 'g': 9695, 'f': 11819, 'i': 33433, 'h': 31290, 'k': 3428, 'j': 369, 'm': 13803, 'l': 17229, 'o': 38778, 'n': 32060, 'q': 314, 'p': 7482, 's': 27463, 'r': 25718, 'u': 13159, 't': 46758, 'w': 13395, 'v': 5140, 'y': 10020, 'x': 592, 'z': 223}

part 3 Statistical Analysis of Files

Write a program that takes two lenames as inputs. The output of the program should be the Bhattacharyya distance between the single letter frequency distributions resulting from each of the les, respectively. Note that to implement this, you will need to use the two functions dened in the rst two challenges.

You have three les in your folder: sample.txt, file1.txt, and file2.txt. The le sample.txt is a large sample of writing in English which you will use to build a statistical prole for letter distribution in the English language. Of the other two les, one is an encryption of a document written in English and one is a random collection of letters.

To see what all this is good for, run the Bhattarchaya distance function on the three text les provided to your group. Find the distance between sample.txt and file1.txt, and the distance between sample.txt and file2.txt. In your written report, write down the results of these two comparisons and analyze what these results mean. Exactly one of these les is an encryption of a document written in English. In your report, explain which le is the le written in English and justify your reasoning.

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!

Augmented Binary Heaps Please use python 2.7 parts of the code have been provided just do the parts that are need please provide functional working code and screenshots for good feedback thanks!...

Please use python code ############################################################################## # Tic-tac-toe (or noughts and crosses) is a simple strategy game in which two players # take...

please use python and pygames Description For this assignment you will write a simple game where the object is to hit a stack of blocks with a ball, and knock all the blocks in the stack....

Dino's International Doughnut Shoppe is looking for a new text-based ordering system to help speed up customer shopping. Throughout the three parts of this assignment you will build a successively...

Use Python Code only. Please help me with this project.I have spent two days but still can't figure out what to do. I need codes in python 3.3 or 3.6 only. you can dis regard C language. I am posting...

please use python An Adjustable Rate Mortgage 1 Introduction An adjustable rate mortgage (ARM) is a mortgage loan where the interest rate is adjusted from time to time based on some index (such as...

please use python programming language to solve the question. ICT133 Tutor-Marked Assignment ICT133 Tutor-Marked Assignment Question 2 Respond to the menu choices according to the description below....

Please use python spyder Assignment Specifications You will be developing a python program that implements the two password cracking methods and allows the user to crack .zip (archive) files. The...

please use Python to do this thank you very much RSS Overview Many websites have content that is updated on an unpredictable schedule. News sites, such as Google News, are a good example of this. One...

Discuss some of the basic functions performed by the bill of lading.

A coin jar contains nickels, dimes, and quarters. There are 47 coins in all. There are 12 more nickels than quarters. The value of the dimes is $3.70 less than the value of the quarters. How many...

Does target account for share repurchases as ( a ) treasury stock

Assets Ingraham Lloyd Cash $ 2 1 0 , 0 0 0 $ 1 6 5 , 0 0 0 Accts / Rec 4 6 0 , 0 0 0 5 7 0 , 0 0 0 Inventory 1 4 0 , 0 0 0 2 3 5 , 0 0 0 Current assets 8 1 0 , 0 0 0 9 7 0 , 0 0 0 Net fixed assets 1...

=+ Harry: We should examine whether our companys productivitygallons of potion per workerwould rise or fall.

=+ Who do you think is right? Why?

=+the amount he or she receives in Social Security benefits is typically reduced.