Geek Logbook

Tech sea log book

Understanding .master() in Apache Spark

In Apache Spark, the .master() method is used to specify how your application will run, either on your local machine or on a cluster. Choosing the correct option is essential depending on your environment. This post will explain the different .master() options in Spark and when to use them. Local Mode The local mode runs

Running PySpark on Google Colab: Do You Still Need findspark?

Introduction For a long time, using Apache Spark in Google Colab required manual setup, including installing Spark and configuring Python to recognize it. This was often done using the findspark library. However, recent changes in Colab have made this process much simpler. In this post, we will explore whether findspark is still necessary and the