Question: You are a data engineer at a tech company. Your team is responsible for analyzing server log files to monitor user activity and system performance. The log files are generated daily and stored in HDFS. Your task is to manage these files and prepare them for analysis.
Write the HDFS shell commands for the following tasks:
Create a directory structure in HDFS to organize the log files, using two commands. The structure should include:
/logs
/logs/daily
/logs/processed
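One way to create all three directories with only two commands is to rely on the `-p` flag of `hdfs dfs -mkdir`, which creates missing parent directories (here, `/logs`) automatically. A minimal sketch:

```shell
# Create /logs/daily; -p also creates the parent /logs if it does not exist
hdfs dfs -mkdir -p /logs/daily

# Create /logs/processed; /logs already exists, but -p is harmless here
hdfs dfs -mkdir -p /logs/processed
```

Without `-p`, three separate commands would be needed (one for `/logs`, then one for each subdirectory).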
Upload Daily Log Files and verify upload:
Assume you have a set of local September log files, named access_log1.txt, access_log2.txt, and access_log3.txt, in the ~/data/local directory on the Linux file system. Upload these files into the /logs/daily directory using one single command.
Then, use one single command to list the contents of the /logs/daily directory to confirm the files were uploaded successfully.
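A sketch of the upload and verification steps, assuming the files live in `~/data/local` and match the glob `access_log*.txt` (the exact filenames are an assumption; adjust the pattern to your actual names):

```shell
# Upload all matching local log files to HDFS in a single command;
# the shell expands the glob, so -put receives every file at once
hdfs dfs -put ~/data/local/access_log*.txt /logs/daily

# List the target directory to confirm the files arrived
hdfs dfs -ls /logs/daily
```

`hdfs dfs -copyFromLocal` would work equally well in place of `-put`; `-put` is the more common shorthand.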
Check if the file contents are relevant:
Use the hdfs dfs -cat command to display the first lines of one of the log files (e.g., access_log1.txt) and check whether the contents are relevant.
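Since `-cat` streams the whole file, the usual idiom is to pipe it into `head` to show only the first lines. A sketch, assuming the file was uploaded as `/logs/daily/access_log1.txt`:

```shell
# Stream the file from HDFS and keep only the first 10 lines
hdfs dfs -cat /logs/daily/access_log1.txt | head -n 10
```

Recent Hadoop releases also provide `hdfs dfs -head <file>`, which prints the first kilobyte of a file directly, but the `-cat | head` pipe works on all versions.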
Archive Processed Logs:
Move the September log files from the /logs/daily to the /logs/processed directory for archiving. You need to rename them with an archived_ prefix to indicate they have been processed. Use one command only for this item.
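`hdfs dfs -mv` can move several sources into one target directory, but adding a per-file prefix requires a distinct destination name for each file, so a single shell loop is one way to keep this to one command line. A sketch, assuming the three filenames used above:

```shell
# Move each file and prepend the archived_ prefix in one combined step;
# -mv both relocates and renames within HDFS (no data copy)
for f in access_log1.txt access_log2.txt access_log3.txt; do
  hdfs dfs -mv "/logs/daily/$f" "/logs/processed/archived_$f"
done
```

If the grader expects a single plain `hdfs dfs -mv` invocation per file instead, each would take the form `hdfs dfs -mv /logs/daily/access_log1.txt /logs/processed/archived_access_log1.txt`.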