Question:

For this assignment, use the cc_info.csv and transactions.csv files. Your
submission should include:
1. The Python script.
2. Sample output.
3. An explanation of your approach and reasoning behind your code.
Each question is worth 10 points, broken down as follows:
- Python script: 6 points
- Output: 2 points
- Code explanation and reasoning: 2 points
Question 1: Suspicious Transaction Analysis (10 points)
Company ABC suspects that transactions that exceed a certain threshold within a short period
of time might be fraudulent. Using the transactions DataFrame:
1. Identify all transactions that are greater than $300.
2. Sort these high-value transactions by the credit_card column and then by date.
3. For each credit_card, calculate the time difference in days between consecutive
high-value transactions.
4. Add a new column called suspicious to indicate whether a transaction is considered
suspicious. A transaction is considered suspicious if:
- It occurs within 3 days of the previous high-value transaction for the same
credit_card.
5. Print the credit_card, transaction_dollar_amount, date, and suspicious
columns for all high-value transactions.
Hint: Use pd.to_datetime() to ensure date format compatibility and diff() to calculate
consecutive differences in days.
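One possible approach to Question 1 is sketched below. It is not an official solution. The function name flag_suspicious, the helper column days_since_prev, and the small sample table are hypothetical and exist only so the snippet runs on its own; in a real submission the DataFrame would come from pd.read_csv("transactions.csv"). The sketch also assumes that "within 3 days" means a gap of at most 3 days, and that the first high-value transaction on a card is never suspicious because it has no predecessor.

```python
import pandas as pd


def flag_suspicious(transactions, threshold=300, window_days=3):
    """Return high-value transactions with a boolean `suspicious` column."""
    df = transactions.copy()
    df["date"] = pd.to_datetime(df["date"])

    # Steps 1-2: keep amounts above the threshold, sort by card and date.
    high = df[df["transaction_dollar_amount"] > threshold]
    high = high.sort_values(["credit_card", "date"]).reset_index(drop=True)

    # Step 3: gap in days to the previous high-value transaction on the same card.
    gap = high.groupby("credit_card")["date"].diff()
    high["days_since_prev"] = gap.dt.total_seconds() / 86400

    # Step 4: the first row per card has a NaN gap, and NaN <= 3 is False.
    # Assumption: "within 3 days" is read as a gap of at most 3 days.
    high["suspicious"] = high["days_since_prev"] <= window_days

    # Step 5: columns requested by the assignment.
    return high[["credit_card", "transaction_dollar_amount", "date", "suspicious"]]


# Made-up sample rows, for illustration only (not from transactions.csv).
sample = pd.DataFrame({
    "credit_card": [1111, 1111, 1111, 2222, 2222],
    "transaction_dollar_amount": [350.0, 420.0, 50.0, 900.0, 310.0],
    "date": [
        "2015-09-01 10:00:00", "2015-09-02 12:00:00", "2015-09-03 09:00:00",
        "2015-09-01 08:00:00", "2015-09-10 08:00:00",
    ],
})

result = flag_suspicious(sample)
print(result.to_string(index=False))
```

Grouping by credit_card before calling diff() keeps the gap from being computed across two different cards, which a plain diff() on the sorted column would do at every card boundary.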
Question 2: Outlier Detection in Transaction Amounts (10 points)
Outlier detection can help identify unusual spending patterns. Using the transactions
DataFrame:
1. Calculate the median and interquartile range (IQR) for the
transaction_dollar_amount column.
2. Define transactions as "outliers" if their transaction_dollar_amount is either below
Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR.
3. Create a new column named is_outlier in the transactions DataFrame. Set it to
True for outliers and False for non-outliers.
4. Calculate and print the percentage of transactions flagged as outliers.
Hint: Use NumPy functions like np.percentile() to calculate Q1 (25th percentile) and Q3
(75th percentile).
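The outlier rule in Question 2 could be sketched as follows. The function name flag_outliers, the stats dictionary, and the sample amounts are hypothetical and only illustrate the calculation. The sketch assumes the amount column has no missing values, since np.percentile() returns NaN when the input contains NaN.

```python
import numpy as np
import pandas as pd


def flag_outliers(transactions, column="transaction_dollar_amount", k=1.5):
    """Add an `is_outlier` column using the Q1 - k*IQR / Q3 + k*IQR fences."""
    df = transactions.copy()
    amounts = df[column].to_numpy()

    # Step 1: median, quartiles, and interquartile range.
    q1, q3 = np.percentile(amounts, [25, 75])
    iqr = q3 - q1

    # Step 2: fences beyond which a value counts as an outlier.
    lower, upper = q1 - k * iqr, q3 + k * iqr

    # Step 3: True for outliers, False otherwise.
    df["is_outlier"] = (df[column] < lower) | (df[column] > upper)

    stats = {
        "median": float(np.median(amounts)),
        "q1": float(q1),
        "q3": float(q3),
        "iqr": float(iqr),
    }
    return df, stats


# Made-up amounts, for illustration only (not from transactions.csv).
sample = pd.DataFrame(
    {"transaction_dollar_amount": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 1000.0]}
)

flagged, stats = flag_outliers(sample)

# Step 4: the mean of a boolean column is the fraction of True values.
outlier_pct = 100 * flagged["is_outlier"].mean()
print(stats)
print(f"Outliers: {outlier_pct:.2f}% of transactions")
```

The function returns a copy so the sample stays untouched; to satisfy step 3 literally, the result can be assigned back to the transactions variable.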
Question 3: Daily Transaction Volume Analysis by Location (10 points)
Fraud patterns may also emerge from analyzing transaction volumes over time and by location.
Using both cc_info and transactions DataFrames:
1. Merge cc_info and transactions on the credit_card column.
2. Group the merged DataFrame by date and state, and calculate the total transaction
amount and transaction count for each state on each date.
3. Identify the top 5 states with the highest total transaction amounts for each date and
print the results.
4. Additionally, calculate the daily transaction amount average for each state and print the
top 3 states with the highest averages over the entire period.
Hint: Use groupby() to aggregate daily transaction data and sort to identify the top states.
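For Question 3, the merge and the two rankings might be sketched like this. All function names, the day column, and the sample rows are hypothetical. The sketch makes three assumptions that are not stated in the assignment: that the state column lives in cc_info, that date may carry a time of day and so is truncated to the calendar day before grouping, and that the "daily transaction amount average" means the mean of each state's daily totals.

```python
import pandas as pd


def daily_state_summary(cc_info, transactions):
    """Merge on credit_card, then total and count transactions per day and state."""
    merged = transactions.merge(cc_info, on="credit_card", how="inner")
    # Assumption: drop the time of day so that grouping is per calendar day.
    merged["day"] = pd.to_datetime(merged["date"]).dt.normalize()
    return merged.groupby(["day", "state"], as_index=False).agg(
        total_amount=("transaction_dollar_amount", "sum"),
        transaction_count=("transaction_dollar_amount", "count"),
    )


def top_states_per_day(daily, n=5):
    """Keep the n states with the largest total amount on each day."""
    ranked = daily.sort_values(["day", "total_amount"], ascending=[True, False])
    return ranked.groupby("day").head(n).reset_index(drop=True)


def top_states_by_daily_average(daily, n=3):
    """Mean of the daily totals per state, largest first."""
    return daily.groupby("state")["total_amount"].mean().nlargest(n)


# Made-up rows, for illustration only (not from the assignment's CSV files).
cc_info = pd.DataFrame({"credit_card": [1, 2, 3], "state": ["NY", "CA", "TX"]})
transactions = pd.DataFrame({
    "credit_card": [1, 1, 1, 2, 2, 3],
    "transaction_dollar_amount": [100.0, 50.0, 30.0, 200.0, 400.0, 20.0],
    "date": [
        "2015-09-01 10:00:00", "2015-09-01 15:00:00", "2015-09-02 09:00:00",
        "2015-09-01 11:00:00", "2015-09-02 12:00:00", "2015-09-02 13:00:00",
    ],
})

daily = daily_state_summary(cc_info, transactions)
top_daily = top_states_per_day(daily, n=2)  # the assignment asks for n=5
top_avg = top_states_by_daily_average(daily, n=3)

print(top_daily.to_string(index=False))
print(top_avg)
```

Sorting once and taking head(n) per group avoids a per-day loop. The inner merge silently drops transactions whose card is missing from cc_info, so it is worth comparing row counts before and after the merge on the real data.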
