Jatin Madaan | Mar 15, 2020 | 6 min read
Productionization Spark Code & Tuning
How Spark Runs on a Cluster. The architecture of a Spark application contains the following main components. Spark Driver: It is just a process...

Jatin Madaan | Apr 28, 2019 | 1 min read
Hive on Spark simple program
## PySpark code to run a SQL command.
code:
## Importing HiveContext
>>> from pyspark.sql import HiveContext
## Create a SqlContext...

Jatin Madaan | Mar 10, 2019 | 1 min read
Accessing Oracle using PySpark
To run Oracle commands on an Oracle server using PySpark.
For EMR, first install the software:
sudo su
pip install cx_Oracle==6.0b1
Function 1 :...

Jatin Madaan | Feb 6, 2019 | 2 min read
Simple PySpark code to read a sequential file, run a SQL query, and store a text file in HDFS:
import time
import sys
import subprocess
## Getting start time of a job
start_time = time.time()
## importing spark and Hive Context to...