Spark Lab 2 – Ingest

Using the tips file from the previous lab:

 

  • Create new DataFrames with:
    • Only the total_bill and tip columns
    • A tip percentage column instead of the tip and total_bill columns
    • The total_bill column renamed to bill
    • An added timestamp column
  • Create a set of Parquet files partitioned by day