Question: The language to be used is Python 3 Questions: This company's sentiment data is used by quantitative analysts and traders as a factor in stock
The language to be used is Python 3
Questions:
This company's sentiment data is used by quantitative analysts and traders as a factor in stock selection models. The research we conduct looks at the relationship between sentiment data and future returns of stock prices.
Part I: Import the data, calculate the return, and merge the data.
1.1 Import SPY.csv
Instruction
-
Import SPY.csv, drop missing values, clean up Date, sort the dataframe by Date (earliest value first).
-
Calculate the daily open-to-close return (Buy at Open price, sell at Close price). Name this new value/column: ret_SPY
-
Calculate the future 20-day cumulative return for SPY. Name this new value/column: ret_SPY_c
-
Keep only Date, ret_SPY, and ret_SPY_c
Hints
For 1. Use .sort_values(), make sure to set 'inplace' to 'True', and 'ascending' to 'True'
For 3. Use this:
indexer = pd.api.indexers.FixedForwardWindowIndexer(window_size=20) SPY['ret_SPY_c'] = ((1.+ SPY['ret_SPY']).rolling(window=indexer, min_periods=20).agg(lambda x : x.prod()) - 1).shift(-1)
Note: windows_size = 20 trading days
** Your code below **
1.2 Import TSLA.csv
Instruction
-
Import TSLA.csv, drop missing values, clean up Date, sort the dataframe by Date (earliest value first).
-
Calculate the daily open-to-close return (Buy at Open price, sell at Close price). Name this new value/column: ret_TSLA
-
Calculate the future 20-day cumulative return for TSLA. Name this new value/column: ret_TSLA_c
-
Keep only Date, ret_TSLA, and ret_TSLA_c
** Your code below **
[ ]:
1.3 Import TSLA_Score.csv
TSLA_Score.csv contains the sentiment score computed from tweets that mentioned Tesla.
Instruction
-
Import TSLA_Score.csv, drop missing values, rename column date to Date, clean up Date, sort the dataframe by Date (earliest value first).
-
Keep only Date, s-score, sv-score, s-buzz
** Your code below **
1.4 Merge the data
Instruction
Merge SPY, TSLA, and TSLA_Score from 1.1, 1.2, and 1.3 by Date. Save it to a dataframe called testData.
** Your code below **
SKY.csv
| Date | Open | High | Low | Close | Volume | Dividend | Split | Adj_Open | Adj_High | Adj_Low | Adj_Close | Adj_Volume |
| 04-01-2019 | 284.7 | 286.16 | 284.4 | 285.83 | 77617945 | 0 | 1 | 284.7 | 286.16 | 284.4 | 285.83 | 77617945 |
| 3/29/2019 | 282.39 | 282.84 | 281.14 | 282.48 | 82160899 | 0 | 1 | 282.39 | 282.84 | 281.14 | 282.48 | 82160899 |
| 3/28/2019 | 280.35 | 281.21 | 279.07 | 280.71 | 56107818 | 0 | 1 | 280.35 | 281.21 | 279.07 | 280.71 | 56107818 |
| 3/27/2019 | 281.11 | 281.76 | 277.93 | 279.65 | 72174824 | 0 | 1 | 281.11 | 281.76 | 277.93 | 279.65 | 72174824 |
| 3/26/2019 | 280.99 | 282.18 | 279.56 | 281.12 | 68047406 | 0 | 1 | 280.99 | 282.18 | 279.56 | 281.12 | 68047406 |
| 3/25/2019 | 278.87 | 280.1853 | 277.64 | 279.04 | 85312509 | 0 | 1 | 278.87 | 280.1853 | 277.64 | 279.04 | 85312509 |
TSLA.csv
| Date | ticker | Adj_Open | Adj_Close |
| 04-01-2019 | TSLA | 282.62 | 289.18 |
| 3/29/2019 | TSLA | 278.7 | 279.86 |
| 3/28/2019 | TSLA | 277.16 | 278.62 |
| 3/27/2019 | TSLA | 268.75 | 274.83 |
| 3/26/2019 | TSLA | 264.44 | 267.77 |
| 3/25/2019 | TSLA | 259.71 | 260.42 |
Step by Step Solution
There are 3 Steps involved in it
Get step-by-step solutions from verified subject matter experts
