Question: This question requires you to complete practical tasks using R, including data
manipulation and transformation. You will work with provided datasets to demonstrate your ability to apply programming concepts in real-world scenarios.
SECTION A: Data Manipulation and Transformation (30 Marks)
Retail businesses continually seek to maximize sales and reduce stockouts, which
necessitates precise inventory management. In this case study, we examine a retail
chain that utilized three interconnected datasets to optimize its inventory levels and
sales strategies. Three datasets are provided as follows:
1. sales.csv
This dataset appears to record sales transactions from a retail chain. Each row
represents a unique sale, identified by a SalesId. The StoreId and ProductId
columns specify the store where the sale occurred and the product sold,
respectively. The Date column indicates when the sale was made. UnitPrice
reflects the price for a single unit of the product at the time of the sale, while
Quantity shows the number of units sold in that transaction. The data includes a
variety of products and spans several years, from at least 2017 to 2020, with
product prices ranging from as low as $0.0525 to as high as $9.205, and quantities
ranging from 8 to 98 units per transaction.
2. products.csv
This dataset lists the products along with their supplier details and
costs. Each product is uniquely identified by a ProductId and is associated with a
ProductName and a Supplier. The ProductCost column gives the cost of the product.
The dataset covers a range of items from groceries such as Chocolate Bar -
Smarties and Pepper - Red Bell to seafood like Cod - Salted, Boneless and
Clam - Cherrystone. It includes products from various suppliers including big box
stores like National Stores, Family Dollar, BJ's Wholesale Club, and Costco,
as well as other retail outlets like Ocean State Job Lot, Fred's, and Gabe's. The
product costs vary, ranging from as little as $0.10 for an 8oz coffee cup to $5.76
for oranges. This dataset can be used to manage inventory, analyze supplier cost
efficiency, or optimize pricing strategies.
3. inventory.csv
This dataset provides inventory details for various retail locations. Each entry
corresponds to a specific ProductId and includes information on StoreId,
StoreName, and the store's Address. The neighborhood column gives the local
area where the store is located, which can be particularly useful for geographic
data analysis and targeted marketing strategies. The QuantityAvailable column
shows the number of units of the product that are currently in stock at each store.
The data captures a range of quantities across multiple stores and locations, such
as 'National Stores' in Bolton Hill with 11 items available, to 'Ocean State Job Lot'
in Fells Point with just 1 item in stock. This information is crucial for supply chain
management, inventory control, and ensuring the availability of products across
the different branches of these retail chains.
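The descriptions above imply that the three datasets link through ProductId and StoreId. A minimal sketch of those joins with dplyr follows. The rows are made up for illustration (only the column names come from the descriptions), the Address and neighborhood columns are omitted for brevity, and it assumes one inventory row per store and product pair.

```r
library(dplyr)

# Made-up rows for illustration only; the real CSV files are not reproduced here.
sales <- data.frame(
  SalesId   = 1:3,
  StoreId   = c(10, 10, 20),
  ProductId = c(101, 102, 101),
  Date      = as.Date(c("2019-05-01", "2020-01-15", "2020-03-02")),
  UnitPrice = c(1.50, 2.00, 1.60),
  Quantity  = c(10, 25, 60)
)
products <- data.frame(
  ProductId   = c(101, 102),
  ProductName = c("Chocolate Bar - Smarties", "Pepper - Red Bell"),
  Supplier    = c("Costco", "Family Dollar"),
  ProductCost = c(1.00, 1.20)
)
inventory <- data.frame(
  ProductId         = c(101, 102, 101),
  StoreId           = c(10, 10, 20),
  StoreName         = c("National Stores", "National Stores", "Ocean State Job Lot"),
  QuantityAvailable = c(11, 1, 5)
)

# Attach product details by ProductId, then stock levels by store and product.
sales_full <- sales %>%
  left_join(products, by = "ProductId") %>%
  left_join(inventory, by = c("StoreId", "ProductId"))

print(sales_full)
```

With the real files, each data frame would instead come from a call such as read.csv("sales.csv"), and the Date column may need converting with as.Date() using whatever format the file actually uses.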
1. Import all three datasets into RStudio. Filter the sales data to include only
transactions that occurred in 2020. How many sales transactions were there in
2020? (3 Marks)
2. Calculate the total revenue for each product in the sales data. Which ProductId
has the highest total revenue? Look up the ProductName for that ProductId in the
products data, and find the StoreId values for that product in the inventory data. (5 Marks)
3. Group the inventory data by StoreId and summarize the average
QuantityAvailable across all products. Which StoreId has the lowest average
quantity available? (5 Marks)
4. Create a new column in the sales data that categorizes sales into 'High'
(Quantity >=50), 'Medium' (Quantity between 20 and 49), and 'Low' (Quantity
<20). What is the count of 'High' category sales? (5 Marks)
5. Arrange the product information dataset in descending order of ProductCost.
Then use tidyr to separate the ProductName into two columns: Product and
Brand, where the Brand is the substring following the '-' character. What is the
Brand of the third most expensive product? (5 Marks)
6. Using the dplyr package, analyze the price elasticity of demand for products in
the retail chain. For this, you will need to create a metric that measures the
percentage change in quantity sold (Quantity) in response to a percentage
change in UnitPrice. To do this, first, calculate the average price and quantity
sold for each product. Then, find the percentage changes between consecutive
time periods. Discuss which products are most and least sensitive to price
changes and how this could influence future pricing strategies. (7 Marks)
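One possible way to approach tasks 1 to 5 with dplyr and tidyr is sketched here. All rows are made up so the code runs on its own, so the values it produces are not answers for the real files. It assumes revenue means UnitPrice times Quantity over the full sales data, and that Date has already been parsed as a Date.

```r
library(dplyr)
library(tidyr)

# Made-up rows for illustration only; replace with read.csv() on the real files.
sales <- data.frame(
  SalesId   = 1:5,
  StoreId   = c(10, 10, 20, 20, 10),
  ProductId = c(101, 102, 101, 102, 101),
  Date      = as.Date(c("2019-05-01", "2020-01-15", "2020-03-02",
                        "2020-07-09", "2020-11-30")),
  UnitPrice = c(1.50, 2.00, 1.60, 2.10, 1.55),
  Quantity  = c(10, 25, 60, 50, 19)
)
products <- data.frame(
  ProductId   = c(101, 102, 103, 104),
  ProductName = c("Chocolate Bar - Smarties", "Pepper - Red Bell",
                  "Clam - Cherrystone", "Cod - Salted, Boneless"),
  ProductCost = c(1.00, 1.20, 3.50, 5.00)
)
inventory <- data.frame(
  ProductId         = c(101, 102, 101, 102),
  StoreId           = c(10, 10, 20, 20),
  QuantityAvailable = c(11, 1, 5, 9)
)

# Task 1: keep only 2020 transactions and count them.
sales_2020 <- sales %>% filter(format(Date, "%Y") == "2020")
n_2020 <- nrow(sales_2020)

# Task 2: total revenue per product, then look up the top product.
revenue <- sales %>%
  group_by(ProductId) %>%
  summarise(TotalRevenue = sum(UnitPrice * Quantity), .groups = "drop") %>%
  arrange(desc(TotalRevenue))
top_id     <- revenue$ProductId[1]
top_name   <- products %>% filter(ProductId == top_id) %>% pull(ProductName)
top_stores <- inventory %>% filter(ProductId == top_id) %>% pull(StoreId)

# Task 3: average stock per store, lowest first.
avg_stock <- inventory %>%
  group_by(StoreId) %>%
  summarise(AvgQuantity = mean(QuantityAvailable), .groups = "drop") %>%
  arrange(AvgQuantity)
lowest_store <- avg_stock$StoreId[1]

# Task 4: label each sale by quantity band; conditions are checked in order.
sales <- sales %>%
  mutate(Category = case_when(
    Quantity >= 50 ~ "High",
    Quantity >= 20 ~ "Medium",
    TRUE           ~ "Low"
  ))
high_count <- sum(sales$Category == "High")

# Task 5: sort by cost, then split ProductName at the first '-' only.
products_sorted <- products %>%
  arrange(desc(ProductCost)) %>%
  separate(ProductName, into = c("Product", "Brand"),
           sep = "\\s*-\\s*", extra = "merge", fill = "right")
third_brand <- products_sorted$Brand[3]
```

The extra = "merge" argument keeps any later hyphens inside Brand rather than dropping them, and fill = "right" leaves Brand as NA for names with no hyphen. Whether task 2 should use all years or only 2020 is not stated in the question, so that choice is an assumption here.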
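For task 6, a rough sketch of one elasticity metric is given here. It takes calendar years as the "time periods", which is an assumption, since the question does not fix the period. The rows are made up, and the ratio of percentage changes is only descriptive, because other factors that move quantity are not controlled for.

```r
library(dplyr)

# Made-up rows for illustration only; replace with the real sales data.
sales <- data.frame(
  ProductId = c(101, 101, 101, 102, 102, 102),
  Date      = as.Date(c("2018-03-01", "2019-03-01", "2020-03-01",
                        "2018-06-01", "2019-06-01", "2020-06-01")),
  UnitPrice = c(1.00, 1.10, 1.21, 2.00, 2.20, 2.42),
  Quantity  = c(50, 40, 32, 50, 49, 48)
)

# Average price and quantity per product and year, then year-over-year changes.
elasticity <- sales %>%
  mutate(Year = as.integer(format(Date, "%Y"))) %>%
  group_by(ProductId, Year) %>%
  summarise(AvgPrice = mean(UnitPrice),
            AvgQuantity = mean(Quantity),
            .groups = "drop") %>%
  arrange(ProductId, Year) %>%
  group_by(ProductId) %>%
  mutate(
    PctPrice    = (AvgPrice - dplyr::lag(AvgPrice)) / dplyr::lag(AvgPrice),
    PctQuantity = (AvgQuantity - dplyr::lag(AvgQuantity)) / dplyr::lag(AvgQuantity),
    # Elasticity is undefined when the price did not change.
    Elasticity  = if_else(PctPrice == 0, NA_real_, PctQuantity / PctPrice)
  ) %>%
  ungroup()

# Rank products by the average size of their elasticity, most sensitive first.
sensitivity <- elasticity %>%
  group_by(ProductId) %>%
  summarise(MeanAbsElasticity = mean(abs(Elasticity), na.rm = TRUE),
            .groups = "drop") %>%
  arrange(desc(MeanAbsElasticity))

most_sensitive  <- sensitivity$ProductId[1]
least_sensitive <- sensitivity$ProductId[nrow(sensitivity)]
```

A mean absolute elasticity above 1 suggests quantity moves proportionally more than price, so price cuts may lift revenue for such products, while values below 1 suggest more room to raise prices. The first year of each product has no earlier period to compare against, so its changes are NA and are dropped by na.rm = TRUE.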
