Hive on Spark simple program
## PySpark code to run a SQL command. Code: ## Importing HiveContext >>> from pyspark.sql import HiveContext ## Create a SqlContext...
To load data from a CSV file (it can be pipe-, tab-, or comma-separated): Step 1: Create a table with the delimiter used in the file. Command:...
To run Oracle commands on an Oracle server using PySpark. For EMR, first install the software: sudo su pip install cx_Oracle==6.0b1 Function 1:...
There is a simple command for this. It runs a MapReduce job, but it is useful in case it is required: last_year=$(hive -e "select...
To copy files to the local machine, we can use the command: aws s3 cp s3://bucket_name/folder_name/file_name.txt . Note the dot at the end to...
We can run almost all hadoop fs commands on the S3 file system as well. E.g.: hadoop fs -du -s -h s3://bucket_name/folder_name 10.1 G ...
When running a Hive query with hive -e or hive -f, merely writing rc=$? below the hive command will not help; it will only tell if...
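As general background on rc=$?, here is a minimal sketch of capturing an exit status in a shell script. It does not reproduce the approach from the post itself, which is cut off above. The run_query function is a hypothetical stand-in for the hive -e or hive -f call, and the status 3 is an arbitrary value chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `hive -e "..."` or `hive -f file.hql`.
# It simply fails with exit status 3 (an arbitrary illustrative value).
run_query() { return 3; }

# Capture the status in the same statement as the command.
# Any command that runs in between would overwrite $?.
rc=0
run_query || rc=$?

echo "query finished"   # a successful echo sets $? to 0
after_echo=$?           # so this is the status of echo, not of run_query

if [ "$rc" -ne 0 ]; then
  echo "query failed with exit status $rc" >&2
fi
```

The `cmd || rc=$?` form is used here only because it also behaves well in scripts that run with `set -e`; writing rc=$? on the line directly after the command captures the same value.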
Get parameters such as workflow_name, start_date, end_date, and parameter_file as input in a file. Loop through the dates to get all date values...
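One way the date loop could look is sketched below. It assumes GNU date (the -d option is not available in the BSD date shipped with macOS), dates in YYYY-MM-DD format, and an inclusive end date; the date_range function name and the sample values are hypothetical, not taken from the post.

```shell
#!/usr/bin/env bash
# Print every date from $1 to $2 (inclusive), one per line, as YYYY-MM-DD.
# Assumes GNU date. -u keeps the arithmetic in UTC so DST shifts cannot skew it.
date_range() {
  current=$1
  end_s=$(date -u -d "$2" +%s)
  # Compare as epoch seconds so the test is a plain integer comparison.
  while [ "$(date -u -d "$current" +%s)" -le "$end_s" ]; do
    echo "$current"
    current=$(date -u -d "$current + 1 day" +%F)
  done
}

# Illustrative values; in the post these come from the input file.
start_date=2020-02-27
end_date=2020-03-01

for d in $(date_range "$start_date" "$end_date"); do
  echo "processing $d"
done
```

If start_date is later than end_date, the loop body never runs and nothing is printed.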
To connect to an AWS cluster (EMR or EC2) via the terminal on a Mac: first make sure you download the .pem file from your AWS account. Once the file has been...
Until loop
until [[ $flag -gt 1 ]]
do
  [code]
done
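A small runnable sketch of an until loop with a counter follows. Inside [[ ]], the > operator compares strings, while -gt compares integers, so the sketch uses -gt. The starting value and the loop body are assumptions made for illustration.

```shell
#!/usr/bin/env bash
flag=0   # illustrative starting value

# The body runs repeatedly until the condition becomes true.
until [[ $flag -gt 1 ]]; do
  echo "flag is $flag"
  flag=$((flag + 1))
done

# String comparison can surprise with multi-digit numbers:
# "10" sorts before "9", so [[ 10 > 9 ]] is false while [[ 10 -gt 9 ]] is true.
if [[ 10 > 9 ]]; then string_cmp=true; else string_cmp=false; fi
if [[ 10 -gt 9 ]]; then int_cmp=true; else int_cmp=false; fi
```

The loop body runs for flag values 0 and 1, and the loop exits once flag reaches 2.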
alias sr="cd /[path_to_folder]"
import time
import sys
import subprocess
## Getting start time of a job
start_time = time.time()
## importing spark and Hive Context to...