Post 10 | HDPCD | Load Pig Relation WITHOUT schema

  Hello everyone, I hope you are finding the tutorials useful. In the previous tutorial, we started the Data Transformation category of the HDPCD certification. This tutorial, covering the second objective in this category, focuses on creating a sample Pig relation without a schema. Before starting with the actual process, let us define what is … Continue reading Post 10 | HDPCD | Load Pig Relation WITHOUT schema

Spark + Python : Passing Function

In this tutorial, we are going to look at various ways to pass functions in Spark using the Python API. I have shown two ways in which functions can be created and called (for a user-defined function). We are going to compare them based on Spark's filtering capabilities. For this, I have created a user-defined function called containsMilind() which … Continue reading Spark + Python : Passing Function