Question:

You are a data engineer at a tech company. Your team is responsible for analyzing server log files to monitor user activity and system performance. The log files are generated daily and stored in HDFS. Your task is to manage these files and prepare them for analysis.
Write the HDFS shell commands for the following tasks:
1. Create a Directory Structure in HDFS to organize the log files, using two commands. The structure should include:
/logs
/logs/daily
/logs/processed
2. Upload Daily Log Files and verify the upload:
Assume you have a set of September log files named access_log_2024_09_01.txt, access_log_2024_09_02.txt, ..., access_log_2024_09_30.txt in the ~/data/local directory on the local Linux file system. Upload these files into the /logs/daily directory using a single command.
Then use a single command to list the contents of the /logs/daily directory to confirm the files were uploaded successfully.
3. Check if the file contents are relevant:
Use the hdfs dfs -cat command to display the first 5 lines of one of the log files (e.g., access_log_2024_09_01.txt) and check whether the contents are relevant.
4. Archive Processed Logs:
Move the September log files from /logs/daily to the /logs/processed directory for archiving, renaming them with the prefix archived_ to indicate they have been processed. (Use only one command for this task.) A command sketch covering all four tasks follows this list.
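
The commands below are a minimal sketch for the four tasks, assuming a standard Hadoop installation where the hdfs dfs shell is on the PATH and the current user has write permission on the HDFS root. Paths and file names are taken from the question itself; the shell loop in the last step is one possible way to satisfy the single-command constraint, because hdfs dfs -mv moves files but has no built-in bulk-rename option.

# Task 1: create the directory structure with two commands
hdfs dfs -mkdir /logs
hdfs dfs -mkdir /logs/daily /logs/processed

# Task 2: upload all September log files in one command, then list the directory to verify
hdfs dfs -put ~/data/local/access_log_2024_09_*.txt /logs/daily
hdfs dfs -ls /logs/daily

# Task 3: display the first 5 lines of one log file to check its contents
# (-cat streams the file from HDFS; head runs on the client side)
hdfs dfs -cat /logs/daily/access_log_2024_09_01.txt | head -5

# Task 4: move the September files to /logs/processed, adding the archived_ prefix
# (one shell loop; each hdfs dfs -mv call renames a file as it moves it)
for f in $(hdfs dfs -ls /logs/daily | awk '{print $NF}' | grep 'access_log_2024_09'); do
    hdfs dfs -mv "$f" /logs/processed/archived_$(basename "$f")
done

Note that hdfs dfs -mkdir -p /logs/daily /logs/processed would create all three directories in a single call, but the two-command version above matches the task as stated.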
