Question: In the code below, what role does the unix_timestamp function play in this date conversion? Where did the "dd/MMM/yyyy:HH:mm:ss" string come from? What is the significance of the final call to cast?
import re
from pyspark.sql.types import *
from pyspark.sql.functions import *
inputPath = "/databricks-datasets/sample_logs/"
df = sqlContext.read.text(inputPath)
# Extract the bracketed timestamp from each log line, parse it to epoch seconds,
# and cast the result to a proper timestamp column. The second column pulls
# field 8 of the space-separated log line (the HTTP status code); the original
# snippet had split(df["value"], "") here, which is assumed to be a garbled " ".
converted = df.select(
    unix_timestamp(regexp_extract(df["value"], r".+\[(.+) -", 1),
                   "dd/MMM/yyyy:HH:mm:ss")
        .cast(TimestampType())
        .alias("timestamp"),
    split(df["value"], " ")[8].alias("status"))
display(converted.take(10))
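To make the three steps concrete without a Spark cluster, here is a minimal pure-Python sketch of what the pipeline does, using a hypothetical Apache-style log line in the same layout as the sample_logs dataset. regexp_extract pulls the text between "[" and " -"; unix_timestamp parses that string into seconds since the epoch according to the "dd/MMM/yyyy:HH:mm:ss" pattern (which mirrors the log's own date layout, and is why that string appears in the code); the final cast(TimestampType()) turns the raw long back into a real timestamp value.

```python
import re
from datetime import datetime, timezone

# Hypothetical log line matching the Apache common-log layout of sample_logs.
line = '127.0.0.1 - - [21/Jul/2014:10:00:00 -0800] "GET /index.html HTTP/1.1" 200 1234'

# Step 1 - regexp_extract equivalent: capture the text between "[" and " -".
ts_str = re.search(r".+\[(.+) -", line).group(1)   # "21/Jul/2014:10:00:00"

# Step 2 - unix_timestamp equivalent: parse with the log's date pattern.
# Java's "dd/MMM/yyyy:HH:mm:ss" corresponds to Python's "%d/%b/%Y:%H:%M:%S".
dt = datetime.strptime(ts_str, "%d/%b/%Y:%H:%M:%S")
epoch_seconds = int(dt.replace(tzinfo=timezone.utc).timestamp())

# Step 3 - cast(TimestampType()) equivalent: a number of seconds is just a
# long; casting converts it back into a typed timestamp value.
ts = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)

# split(df["value"], " ")[8] equivalent: field 8 is the HTTP status code.
status = line.split(" ")[8]
```

Without the final cast the column would remain a plain long (epoch seconds), so date functions and time-based plotting would not treat it as a timestamp.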
