Question: CS 135 Assignment 10: Data Mining Trump


CS 135 Assignment 10: Data Mining Trump

The time is 2020 and more than ever the saying "Once it's on the internet, it's there forever" holds true. This assignment will be a little different from the assignments you have done before. This time we are going to apply our C++ programming skills to learn and discover information that would otherwise take us days, months, or years to process. In short, we are going to perform what is known as data analytics.

Twitter is a social media platform that allows anyone to create an account and post public messages of up to 280 characters (apparently, they upped it from 140 at some point). Since its inception, Twitter has grown very popular with celebrities as a way of communicating their thoughts to the public, as if they were texting their fans, and of letting fans potentially interact with them. Because of its popularity it has also been used by government entities as a way of trying to reach young citizens and potential voters, for good or for bad. After all, good is a point of view, Anakin.

One of the most controversial figures to use Twitter today is our very own 45th President of the United States of America: Donald J. Trump. (Play Air Force One Theme Here.) Regardless of how you feel about him or his political views, I think we can all agree that his Twitter has had a great impact on today's culture. So with that we arrive at the goal of this assignment. Let's analyze Trump's tweets!

But before I go any further: I know some of you may not feel comfortable reading Trump's tweets, so I have created (Stack Overflow copy-paste was heavily involved) a script that can download anyone's public tweet history, and you can use that for the assignment as well. As a test I've attached NASA's tweets, so you can use those instead. The program will work with anyone's tweets, be it Trump, NASA, or any other public account, since what your program is reading is a file with a specific format.
You can even make your own test file. So, if you feel uneasy with Trump's tweets, go on Discord and pick a Twitter account, and I'll run my script and get you the tweets for that account. It would be cool to explore something that interests you and post anything cool you find. In fact, maybe we can do Joe Biden's account and go full political in a programming class LOL. Please no.

The goal of the assignment is to process the input file and read it into an array of structs, each representing a tweet. Then you will perform some very basic data analytics tasks. A full description of these follows later, but first let me show what the input file looks like by pasting the first few lines:

source,text,created_at,retweet_count,favorite_count,is_retweet,id_str
Twitter for iPhone,Disgraceful! https://t.co/K5YEpOzvL5,06-04-2020 12:38:46,7474,28291,false,1268522581944082433
Twitter for iPhone,MAKE AMERICA GREAT AGAIN!,06-04-2020 10:59:50,31732,172210,false,1268497685553823745
Twitter for iPhone,LAW & ORDER!,06-04-2020 10:58:41,19797,107755,false,1268497398239842307
Twitter for iPhone,The Fake Newspaper! https://t.co/X6LEqpQeBc,06-04-2020 10:58:17,8350,26343,false,1268497297702367235

The file is in a simplified version of the CSV format, which stands for Comma Separated Values. The first line of the file describes what each value represents. Think of it like an Excel spreadsheet: each line in the file represents a row of the spreadsheet, with the first row holding the column titles. This means that you have the following information for each tweet:

- source: The device or client that was used to create the tweet, such as iPhone, Android, website, etc.
- text: The raw text of the tweet, with some characters (like commas) removed to avoid breaking the CSV format.
- created_at: The time the tweet was posted.
- retweet_count: The number of times the tweet has been retweeted.
- favorite_count: The number of times the tweet has been favorited by other users.
- is_retweet: true or false, depending on whether the tweet is a retweet. I removed all the retweets from the test file, but still make sure you process this field.
- id_str: The unique identifier attached to each tweet in Twitter's servers.

The value of each cell is separated by a comma. This is a very important thing to remember so you can properly parse the file.

Onto the program itself:

The first thing you must do is open the file named by a command line argument, such as:

./a.out twitter.txt

The second argument represents the CSV file containing the tweets. If the user does not pass an argument with the file name, the program terminates after printing "Invalid Input File".

Next, create a vector to store the tweets using

vector<Tweet> tweets;

Note that Tweet is a struct that contains one variable per bullet point given above. I used a mixture of strings and long unsigned integers to create it. I also made an enum for the source, but you can keep it as a string if you prefer. Then loop through the entire file using filestreams. To insert into a vector you can use the following command:

tweets.push_back(tweet);

That will insert one struct of Tweet type into your tweets vector. Afterwards you can treat it like you would treat any array, so tweets[1] will give you the second struct in the array (the second tweet). A useful function to keep track of the size of the vector/array is size:

tweets.size()

After you have inserted all tweets into the vector you can start querying the user (see the sample output for an example):

1. First print out how many tweets were read.
2. Next ask the user to enter a phrase and count how many tweets contain it. Then show the result. To do this you will write a function that takes the vector and the string to search for. This can help you search for hashtag trends!
3. Next ask the user to enter another phrase, find all of the tweets that contain it, add up the retweet counts of those tweets, and print the final number. In other words, we want to find how popular a specific phrase was.
4. Next ask the user to enter a phrase to search for and print all of the tweets that contain it. In this case you will match the phrase against the text of every tweet and print each tweet that contains it. All three of these results should be printed to stdout.
5. Next you will open a file using filestreams and write all of the tweets to it in a nice format (see the sample output). First ask the user what file name they want, then open that file and write to it.
6. Finally, you will find and print the top 10 most favorited tweets and also the least favorited tweet. My suggestion is to use a selection sort style of searching but limit it to the first 10. You could also sort the entire data set, but that may take a long time. Note that the least favorited tweet is probably going to be a recent tweet with 0 favorites, so don't be surprised at this.

That's it! You have now begun your path towards data analytics, as you can now start to analyze trends on Twitter by seeing how specific phrases in tweets affect their retweet-ability and favorite-ability!
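The parsing and phrase-counting steps above can be sketched roughly as follows. This is only one possible approach, not the official solution: the `Tweet` field names and the helpers `toNumber`, `splitLine`, and `parseTweet` are hypothetical names introduced for illustration, the source is kept as a plain string instead of an enum, and the caller is assumed to skip the header line before parsing. The search is written by hand because the assignment asks for your own find function.

```cpp
#include <string>
#include <vector>

using namespace std;

// One member per CSV column; the member names are assumptions.
struct Tweet {
    string source;
    string text;
    string createdAt;
    unsigned long retweetCount = 0;
    unsigned long favoriteCount = 0;
    bool isRetweet = false;
    string idStr;
};

// Hypothetical helper: converts a run of decimal digits to a number.
unsigned long toNumber(const string& digits) {
    unsigned long value = 0;
    for (char c : digits) {
        if (c < '0' || c > '9') break;  // stop at the first non-digit
        value = value * 10 + static_cast<unsigned long>(c - '0');
    }
    return value;
}

// Hypothetical helper: splits one line on commas. This relies on the
// file's guarantee that commas were removed from the tweet text.
vector<string> splitLine(const string& line) {
    vector<string> cells(1);
    for (char c : line) {
        if (c == ',') cells.push_back("");
        else if (c != '\r') cells.back() += c;  // ignore Windows line endings
    }
    return cells;
}

// Fills `tweet` from one data line; returns false (leaving `tweet`
// untouched) when the line does not have exactly seven cells.
bool parseTweet(const string& line, Tweet& tweet) {
    vector<string> cells = splitLine(line);
    if (cells.size() != 7) return false;
    tweet.source = cells[0];
    tweet.text = cells[1];
    tweet.createdAt = cells[2];
    tweet.retweetCount = toNumber(cells[3]);
    tweet.favoriteCount = toNumber(cells[4]);
    tweet.isRetweet = (cells[5] == "true");
    tweet.idStr = cells[6];
    return true;
}

// Hand-written, case-sensitive substring search.
bool strFind(const string text, const string phrase) {
    for (size_t start = 0; start + phrase.size() <= text.size(); ++start) {
        size_t matched = 0;
        while (matched < phrase.size() && text[start + matched] == phrase[matched]) {
            ++matched;
        }
        if (matched == phrase.size()) return true;
    }
    return false;
}

// Step 2: number of tweets whose text contains the phrase.
int countTweets(const vector<Tweet>& tweets, const string phrase) {
    int count = 0;
    for (size_t i = 0; i < tweets.size(); ++i) {
        if (strFind(tweets[i].text, phrase)) ++count;
    }
    return count;
}

// Step 3: sum of retweet counts over tweets containing the phrase.
int countRetweets(const vector<Tweet>& tweets, const string phrase) {
    int total = 0;
    for (size_t i = 0; i < tweets.size(); ++i) {
        if (strFind(tweets[i].text, phrase)) {
            total += static_cast<int>(tweets[i].retweetCount);
        }
    }
    return total;
}
```

In a full program, a reading loop would call `getline` on the input filestream, pass each line to `parseTweet`, and `push_back` the result only when it returns true, so malformed rows are skipped rather than stored.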
Here is a list of all the functions you need:

//Function Prototypes:
void setSource(Tweet&, string);
string getSource(const Source);
void printTweet(ostream&, const Tweet&);
void readTweets(char**, vector<Tweet>&);
int countTweets(const vector<Tweet>&, const string);
int countRetweets(const vector<Tweet>&, const string);
void phrasePrinter(ostream&, const vector<Tweet>&, const string);
void fprintAllTweets(const vector<Tweet>&);
void sortNprintByFavorite(ostream&, vector<Tweet>&);
bool strFind(const string, const string);

Your main should be a combination of calls to these functions, along with some getline calls to read user input and some cout statements. Be sure to test your program for edge cases!

Notes:
- Comment your source code appropriately.
- Make sure you name your program file in accordance with the syllabus stipulations.
- Make sure you test your program on sally.
- No STL or any library functions not authorized. You will need to create your own find/compare and swap functions.

Sample Output (using Trump's tweets):

AST10$ g++ ast10.cpp -std=c++11
AST10$ ./a.out twitter.txt
36128 Tweets read from 45th President of the United States Donald J. Trump.
Enter a Phrase to Count how many Tweets contain it: #MAGA
#MAGA Found in 365 Tweets! :o
Enter Phrase to add all retweets of a tweet that contains the given phrase: Fake News
There were 15145729 retweets of tweets containing the term 'Fake News'!
Enter a phrase to search for and print all tweets that contain it: AR-15
Tweets with term 'AR-15':

Tweet Source: Twitter for iPhone
Text: RT @Liz_Wheeler: Your daily reminder that Democrats want to:- Open bordersAbolish private health insurance- Ban your AR-15s- Give il
Created at: 01-15-2020 06:06:23
Retweet Count: 19263
Favorite Count: 0
Not a Retweet
ID STR: 1217327146747727872

Enter Name of File to Save all Tweets to: tweets.txt
Top 10 Favorited Tweets:

Tweet Source: Twitter for iPhone
Text: A$AP Rocky released from prison and on his way home to the United States from Sweden. It was a Rocky Week get home ASAP A$AP!
Created at: 08-02-2019 17:41:30
Retweet Count: 251530
Favorite Count: 879647
Not a Retweet
ID STR: 1157345692517634049

Tweet Source: Twitter for iPhone
Text: The United States of America will be designating ANTIFA as a Terrorist Organization.
Created at: 05-31-2020 16:23:43
Retweet Count: 225693
Favorite Count: 830795
Not a Retweet
ID STR: 1267129644228247552

Tweet Source: Twitter for iPhone
Text: https://t.co/VXeKiVzpTf
Created at: 01-03-2020 02:32:53
Retweet Count: 172157
Favorite Count: 814012
Not a Retweet
ID STR: 1212924762827046918

Tweet Source: Twitter for iPhone
Text: CHINA!
Created at: 05-29-2020 13:01:56
Retweet Count: 153916
Favorite Count: 799755
Not a Retweet
ID STR: 1266354084036194306

Tweet Source: Twitter for iPhone
Text: All is well! Missiles launched from Iran at two military bases located in Iraq. Assessment of casualties & damages taking place now. So far so good! We have the most powerful and well equipped military anywhere in the world by far! I will be making a statement tomorrow morning.
Created at: 01-08-2020 02:45:24
Retweet Count: 158004
Favorite Count: 764333
Not a Retweet
ID STR: 1214739853025394693

Tweet Source: Twitter for iPhone
Text: MERRY CHRISTMAS!
Created at: 12-25-2019 12:26:31
Retweet Count: 115372
Favorite Count: 735775
Not a Retweet
ID STR: 1209812664601522178

Tweet Source: Twitter for iPhone
Text: Kobe Bryant despite being one of the truly great basketball players of all time was just getting started in life. He loved his family so much and had such strong passion for the future. The loss of his beautiful daughter Gianna makes this moment even more devastating....
Created at: 01-26-2020 23:54:34
Retweet Count: 94246
Favorite Count: 735478
Not a Retweet
ID STR: 1221582230008619016

Tweet Source: Twitter for iPhone
Text: Just spoke to @KanyeWest about his friend A$AP Rockys incarceration. I will be calling the very talented Prime Minister of Sweden to see what we can do about helping A$AP Rocky. So many people would like to see this quickly resolved!
Created at: 07-19-2019 20:01:47
Retweet Count: 210186
Favorite Count: 734567
Not a Retweet
ID STR: 1152307567634391041

Tweet Source: Twitter for iPhone
Text: https://t.co/11nzKwOCtU
Created at: 11-27-2019 15:54:39
Retweet Count: 201773
Favorite Count: 700861
Not a Retweet
ID STR: 1199718185865535490

Tweet Source: Twitter for iPhone
Text: The United States just spent Two Trillion Dollars on Military Equipment. We are the biggest and by far the BEST in the World! If Iran attacks an American Base or any American we will be sending some of that brand new beautiful equipment their way...and without hesitation!
Created at: 01-05-2020 05:11:03
Retweet Count: 146706
Favorite Count: 684981
Not a Retweet
ID STR: 1213689342272659456

Least Favorited Tweet:

Tweet Source: Twitter for iPhone
Text: RT @Breaking911: Neil Cavuto who just told viewers they would die if they took Hydroxychloroquine speaks to a doctor who says the drug ca
Created at: 05-19-2020 01:46:25
Retweet Count: 9668
Favorite Count: 0
Not a Retweet
ID STR: 1262560207013584898

Total Tweets Read: 36128
Thanks for Playing!

Sample Output (using NASA's tweets):

AST10$ g++ ast10.cpp -std=c++11
AST10$ ./a.out NASA_twitter.txt
2266 Tweets read from NASA.
Enter a Phrase to Count how many Tweets contain it: #LaunchAmerica
#LaunchAmerica Found in 202 Tweets! :o
Enter Phrase to add all retweets of a tweet that contains the given phrase: Mars
There were 90178 retweets of tweets containing the term 'Mars'!
Enter a phrase to search for and print all tweets that contain it: Titan
Tweets with term 'Titan':

Tweet Source: Other
Text: "b""Saturns moon Titan is drifting away from the planet 100 times faster than previously understood. \ \ These @NASASolarSystem findings. based on data from our Cassini mission. are an important piece to understanding the Saturn system's creation. More: https://t.co/3V2fqEuVDy https://t.co/D62EYu901X"""
Created at: 2020-06-09 00:38:00
Retweet Count: 1868
Favorite Count: 12334
Not a Retweet
ID STR: 1270153137270251520

Tweet Source: Other
Text: "b""Our @NASASolarSystem mission #Dragonfly will send a rotocraft to the surface of Saturn's moon Titan. searching for the building blocks of life. Dive into the exploration goals of what will be the first multi-rotor vehicle to fly science beyond Earth: https://t.co/kfGXoRjqgy https://t.co/VaXYCAQM6e"""
Created at: 2020-02-26 01:30:47
Retweet Count: 1247
Favorite Count: 5716
Not a Retweet
ID STR: 1232478082885226496

Tweet Source: Other
Text: "b"" A dynamic world of dunes. plains. craters & terrains is revealed in this first-ever geologic map of Saturn's moon. Titan. Lakes and seas are marked blue. but they aren't water! What rains down is methane and ethane in Titan's frigid climate. Zoom in: https://t.co/ufdpxYNEqP https://t.co/lNoA7lkhjo"""
Created at: 2019-11-19 01:30:01
Retweet Count: 453
Favorite Count: 2238
Not a Retweet
ID STR: 1196601487369134080

Enter Name of File to Save all Tweets to: nasa_tweets.txt
Top 10 Favorited Tweets:

Tweet Source: Other
Text: "b""This is the first time in human history @NASA_Astronauts have entered the @Space_Station from a commercially-made spacecraft. @AstroBehnken and @Astro_Doug have finally arrived to the orbiting laboratory in @SpaceX's Dragon Endeavour spacecraft. https://t.co/3t9Ogtpik4"""
Created at: 2020-05-31 17:24:06
Retweet Count: 64511
Favorite Count: 262034
Not a Retweet
ID STR: 1267144838153211905

Tweet Source: Other
Text: b'We have liftoff. History is made as @NASA_Astronauts launch from @NASAKennedy for the first time in nine years on the @SpaceX Crew Dragon: https://t.co/alX1t1JBAt'
Created at: 2020-05-30 19:24:09
Retweet Count: 71188
Favorite Count: 205584
Not a Retweet
ID STR: 1266812660374528001

Tweet Source: Other
Text: "b""We don't know who needs to hear this. but you are made of star stuff. https://t.co/RlYHAd0xKy"""
Created at: 2020-05-04 23:55:00
Retweet Count: 53893
Favorite Count: 180485
Not a Retweet
ID STR: 1257458740938387463

Tweet Source: Other
Text: "b""We're saddened by the passing of celebrated #HiddenFigures mathematician Katherine Johnson. Today. we celebrate her 101 years of life and honor her legacy of excellence that broke down racial and social barriers: https://t.co/Tl3tsHAfYB https://t.co/dGiGmEVvAW"""
Created at: 2020-02-24 14:49:57
Retweet Count: 67432
Favorite Count: 162581
Not a Retweet
ID STR: 1231954422785363968

Tweet Source: Other
Text: b'ICYMI: Earth is beautiful. https://t.co/b9A2TULaAn'
Created at: 2020-05-22 00:55:18
Retweet Count: 24982
Favorite Count: 126307
Not a Retweet
ID STR: 1263634509360173066

Tweet Source: Other
Text: b'Vehicle is supersonic. #LaunchAmerica https://t.co/ea7iteD2j9'
Created at: 2020-05-30 19:25:00
Retweet Count: 26548
Favorite Count: 114611
Not a Retweet
ID STR: 1266812876125323267

Tweet Source: Other
Text: b'Welcome aboard the @SpaceX Crew Dragon spacecraft! \ \ In this video from space. @AstroBehnken and @Astro_Doug reveal the name of their capsule: Endeavour. Take a look inside as the crew continues their journey to the @Space_Station: https://t.co/K9S5mejONx https://t.co/mvH8UhE5FW'
Created at: 2020-05-31 00:11:59
Retweet Count: 17475
Favorite Count: 89774
Not a Retweet
ID STR: 1266885097359388672

Tweet Source: Other
Text: b'LIVE NOW: We are launching astronauts to the @Space_Station from @NASAKennedy for the first time in nine years. Liftoff is at 3:22pm ET. #LaunchAmerica\ \ https://t.co/UPmFv01Adf https://t.co/UPmFv01Adf'
Created at: 2020-05-30 19:03:12
Retweet Count: 34565
Favorite Count: 88297
Not a Retweet
ID STR: 1266807391305314305

Tweet Source: Other
Text: b'LIVE NOW: History is about to be made. Watch as @NASA_Astronauts #LaunchAmerica to the @Space_Station from American soil for the first time in nine years: https://t.co/U1COQzFy4v https://t.co/U1COQzFy4v'
Created at: 2020-05-27 16:21:26
Retweet Count: 42108
Favorite Count: 72842
Not a Retweet
ID STR: 1265679515193409541

Tweet Source: Other
Text: "b'""We are not going to launch today.""\ \ Due to the weather conditions. the launch is scrubbing. Our next opportunity will be Saturday. May 30 at 3:22pm ET. Live #LaunchAmerica coverage will begin at 11am ET. https://t.co/c7R1AmLLYh'"
Created at: 2020-05-27 20:21:02
Retweet Count: 22538
Favorite Count: 67258
Not a Retweet
ID STR: 1265739813837307906

Least Favorited Tweet:

Tweet Source: Other
Text: "b""@TommasoastroTo1 Docking is currently scheduled for 10:15am ET with our live coverage starting at 9:30am ET. Be sure to tune in to see the crew's arrival!"""
Created at: 2020-04-09 08:15:21
Retweet Count: 8
Favorite Count: 0
Not a Retweet
ID STR: 1248162572047613953

Total Tweets Read: 2266
Thanks for Playing
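For step 6, the suggested "selection sort limited to the first 10" might look like the sketch below. It is an illustration under assumptions rather than the expected solution: the `Tweet` here is a trimmed-down stand-in with only two assumed members, and `swapTweets`, `sortTopByFavorite`, and `leastFavoritedIndex` are hypothetical names (the assignment's own prototype for this step is `sortNprintByFavorite`, which also prints). The swap is written by hand because library helpers are not allowed.

```cpp
#include <string>
#include <vector>

using namespace std;

// Trimmed-down stand-in for the assignment's Tweet struct (assumed members).
struct Tweet {
    string text;
    unsigned long favoriteCount;
};

// Hand-written swap of two tweets.
void swapTweets(Tweet& a, Tweet& b) {
    Tweet temp = a;
    a = b;
    b = temp;
}

// Partial selection sort: after the call, the first `limit` elements are
// the most favorited tweets in descending order; the rest stay unsorted.
void sortTopByFavorite(vector<Tweet>& tweets, size_t limit) {
    if (limit > tweets.size()) limit = tweets.size();
    for (size_t i = 0; i < limit; ++i) {
        size_t best = i;
        // Find the most favorited tweet among positions i..end.
        for (size_t j = i + 1; j < tweets.size(); ++j) {
            if (tweets[j].favoriteCount > tweets[best].favoriteCount) best = j;
        }
        if (best != i) swapTweets(tweets[i], tweets[best]);
    }
}

// Returns the index of the least favorited tweet.
// Assumes `tweets` is not empty.
size_t leastFavoritedIndex(const vector<Tweet>& tweets) {
    size_t least = 0;
    for (size_t i = 1; i < tweets.size(); ++i) {
        if (tweets[i].favoriteCount < tweets[least].favoriteCount) least = i;
    }
    return least;
}
```

Calling `sortTopByFavorite(tweets, 10)` makes at most ten passes over the vector, which is why it is cheaper than selection-sorting all 36128 tweets. When several tweets tie at 0 favorites, `leastFavoritedIndex` returns the first one it encounters, so the tweet reported may differ from the sample output depending on the order of the vector.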
