Bridging the Gap: GCP, Kaggle, and Spark for Aspiring Data Engineers
In this blog post, we tackle a real-world data engineering challenge. We source our datasets from Kaggle, a public data science platform, and use GCP services to run PySpark jobs for the ETL tasks and to manage the queries and results for downstream analytics and reporting.