Question: Functions to be used: PySpark SQL aggregate functions (collect_set(), avg(), countDistinct(), count(), first(), last())

Write a program

Create your own data file as a CSV file. Use this file in your code.

Create the schema.

Use the 6 aggregate functions listed above.

Display your output for each use of a function.

You must write comments of what you are doing among the statements.

Place your comments in a print statement so that they appear in the output as well as in the source code, like print("# This is a comment").

Use PySpark and PyCharm.
