Question 1) Using the data structure below, create a DataFrame by adding data types and column names. The column names and their corresponding data types are:
| Column | Data Type |
|---|---|
| Id | INT |
| First | STRING |
| Last | STRING |
| Url | STRING |
| Published | STRING |
| Hits | INT |
| Campaigns | ARRAY[STRING] |
Print the schema of your DataFrame. Explain the main advantage of specifying data types when creating DataFrames.
data = [
    [1, "Jules", "Damji", "https://tinyurl.1", "1/4/2016", 4535, ["twitter", "LinkedIn"]],
    [2, "Brooke", "Wenig", "https://tinyurl.2", "5/5/2018", 8908, ["twitter", "LinkedIn"]],
    [3, "Denny", "Lee", "https://tinyurl.3", "6/7/2019", 7659, ["web", "twitter", "FB", "LinkedIn"]],
    [4, "Tathagata", "Das", "https://tinyurl.4", "5/12/2018", 10568, ["twitter", "FB"]],
    [5, "Matei", "Zaharia", "https://tinyurl.5", "5/14/2014", 40578, ["web", "twitter", "FB", "LinkedIn"]],
    [6, "Reynold", "Xin", "https://tinyurl.6", "3/2/2015", 25568, ["twitter", "LinkedIn"]],
]
Question 2) Add a new column to the DataFrame created in Question 1 with the following specs:
The column name is Big Hitters.
Values are True or False: True if the value in the Hits column is greater than 10000, otherwise False.
