Question: We wish to separate this into smaller strings by splitting it at commas. We will then coerce each resulting token into the appropriate data type according to the following schema:

test_schema = [int, int, str, str, float]

The function call process_line(test_line, test_schema, ',') should return the following list:

[6, 174, 'Blah', 'Hello World', 7.37]

Write a function named process_line() that accepts three parameters line, schema, and sep as described above.

The function should perform the following steps:

1. Use the split() method to split the string line on the character provided by sep. Store the resulting list in a variable named tokens.
2. Create an empty list named result.
3. Simultaneously loop over the elements of the lists tokens and schema (which are intended to have the same number of elements). Each time the loop executes, perform the following steps:
   - Store the current element of tokens in a variable named t.
   - Store the current element of schema in a variable named dt.
   - Note that dt will contain a data type, while t will hold a string. Coerce t into a value with the desired data type using dt(). Append the converted value to the list result.
4. Return result.
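The steps above can be sketched as follows. This is a minimal sketch rather than a definitive solution; it assumes the "simultaneous" loop is done with the built-in zip(), which the steps do not name explicitly.

```python
def process_line(line, schema, sep):
    # Step 1: split the line into string tokens on the separator.
    tokens = line.split(sep)
    # Step 2: start with an empty result list.
    result = []
    # Step 3: walk tokens and schema in parallel (zip is an assumed choice;
    # it stops at the shorter of the two lists).
    for t, dt in zip(tokens, schema):
        # dt is a type such as int or float, so dt(t) converts the string t.
        result.append(dt(t))
    # Step 4: return the converted values.
    return result


test_schema = [int, int, str, str, float]
test_line = '6,174,Blah,Hello World,7.37'
print(process_line(test_line, test_schema, ','))
# [6, 174, 'Blah', 'Hello World', 7.37]
```

Because each entry of schema is itself a callable type, no if/else chain over type names is needed; calling dt(t) performs the conversion directly.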

We will apply this function in the next example, but we will first test it to make sure that it is working correctly.

In a new code cell, create the following objects:

test_schema = [int, int, str, str, float]

test_line = '6,174,Blah,Hello World,7.37'

Use the process_line() function to split the string test_line at the commas, coercing the individual tokens into the data types stored in test_schema. Print the result.

Problem 3: Processing File Input

In this problem, you will write a function named read_file__list() that will read rows of text from a data file, tokenize the rows into individual values, coerce those values into the proper data types, and then return the results in the form of a list of lists.

The function should accept three parameters named path, schema, and sep.

path should be a string representing the path to a data file.

schema should be a list of data types indicating the desired types for the columns stored in the data file.

sep should be a string indicating the character used to separate values stored in the lines of the data file.

We will assume that the first line of the data files used will contain header information that assigns a name to each column.

This line will be tokenized (split), but the values will be left as strings, rather than being coerced according to schema.

As an example, assume that a data file located at path contains the following lines of text:

Name,Age,PayRate
Anna,27,15.25
Bradley,31,16.75
Catherine,23,15.50

Define my_schema = [str, int, float].

Then the function call read_file__list(path, my_schema, ',') should return the following list of lists:

[['Name', 'Age', 'PayRate'], ['Anna', 27, 15.25], ['Bradley', 31, 16.75], ['Catherine', 23, 15.5]]
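The difference between the header row and the data rows in the list above can be illustrated with a short sketch. The variable names header_line, data_line, header, and row are hypothetical and used only for illustration; the two lines of text are taken from the example file.

```python
sep = ','
schema = [str, int, float]

header_line = 'Name,Age,PayRate'   # first line of the example file
data_line = 'Anna,27,15.25'        # first data row of the example file

# The header is only split, so every value stays a string.
header = header_line.split(sep)

# A data row is split and then each token is coerced by its schema entry.
row = [dt(t) for t, dt in zip(data_line.split(sep), schema)]

print(header)  # ['Name', 'Age', 'PayRate']
print(row)     # ['Anna', 27, 15.25]
```

Applying the schema to the header would fail here, since int('Age') raises a ValueError, which is why the header is handled separately.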

Write a function named read_file__list() that accepts three parameters path, schema, and sep as described above.

The function should perform the following steps:

1. Use with, open(), and read() to read the contents of the file into a string named contents.
2. Use split() to separate the string into a list named lines by splitting on the newline character.
3. Create an empty list named data. This will eventually contain the list that is to be returned.
4. The first line contains header information and will be processed differently from the other lines. Split the first string in lines on sep. Append the resulting list to the list data.
5. We no longer need the first element of lines. For convenience, you may delete it.
6. Loop over the remaining elements of lines. Each time the loop executes, use the function process_line() to process the current line, appending the resulting list to data. Use the values provided to the parameters schema and sep.
7. Return data.
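A minimal sketch of these steps is shown below, under a few stated assumptions. The function name is written exactly as it appears in this text; process_line() is repeated so the block runs on its own; and the check that skips empty lines is an addition that the steps do not mention, included because a file ending in a newline would otherwise produce an empty final string that cannot be coerced to a number.

```python
def process_line(line, schema, sep):
    # Split the line and coerce each token with its matching schema type.
    tokens = line.split(sep)
    result = []
    for t, dt in zip(tokens, schema):
        result.append(dt(t))
    return result


def read_file__list(path, schema, sep):
    # Step 1: read the whole file into one string.
    with open(path) as f:
        contents = f.read()
    # Step 2: break the string into lines.
    lines = contents.split('\n')
    # Step 3: list of lists to be returned.
    data = []
    # Step 4: the header is split but left as strings.
    data.append(lines[0].split(sep))
    # Step 5: drop the header so only data rows remain.
    del lines[0]
    # Step 6: coerce every remaining row according to the schema.
    for line in lines:
        if line == '':
            # Assumption, not part of the steps: skip blank lines,
            # such as the one left by a trailing newline.
            continue
        data.append(process_line(line, schema, sep))
    # Step 7: return the header plus the converted rows.
    return data
```

If the data files are known to contain no blank lines, the empty-line check can be removed and the function follows the seven steps exactly.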

We will now test the read_file__list() function on two small data files. We will start with a data file that contains 10 observations from the Diamonds Dataset. Make sure that the file diamonds_partial.txt is in the same directory as your notebook. I suggest opening this file so that you can see what its contents look like.

Use read_file__list() to read the contents of the file diamonds_partial.txt. Individual values within the rows of this data file are separated by commas. The data types for the columns in this data set are given by the following schema: [float, str, str, str, int]. Store the resulting list in a variable named diamond_data. Use a loop to print the lists contained in diamond_data with each list appearing on its own line.
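Printing a list of lists one row per line can be sketched as below. Since the contents of diamonds_partial.txt are not shown here, this sketch uses a hypothetical list named sample_data built from the Name/Age/PayRate example instead of the real diamond_data.

```python
# Hypothetical stand-in for the list returned by the file-reading function.
sample_data = [
    ['Name', 'Age', 'PayRate'],
    ['Anna', 27, 15.25],
    ['Bradley', 31, 16.75],
    ['Catherine', 23, 15.5],
]

# Each inner list is printed on its own line.
for row in sample_data:
    print(row)
```

The same loop applies unchanged to diamond_data once the file has been read.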

We will now test our function with a file containing 10 observations from the Titanic Dataset. Make sure that the file titanic_partial.txt is in the same directory as your notebook. I suggest taking a look at the contents of this file.

Use read_file__list() to read the contents of the file titanic_partial.txt. Individual values
