This code snippet uses Spark on Qubole to load a CSV file into a DataFrame, register it as a temporary table, and create a permanent table from the data in that temporary table.
The first statement reads the CSV file from an S3 location into a DataFrame. The options for format, delimiter, header, and inferSchema control how the file is parsed: fields are split on the pipe character, the first row is treated as column names, and column types are inferred from the data.
// Read the pipe-delimited CSV from S3 using the spark-csv package
val df = sqlContext.read.format("com.databricks.spark.csv")
  .option("delimiter", "|")      // fields are separated by "|"
  .option("header", "true")      // first row holds the column names
  .option("inferSchema", "true") // infer column types from the data
  .load("s3://*****.CSV")
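Once the load completes, it can be useful to confirm that the schema was inferred as expected before creating any tables. A minimal sketch of that check, assuming the DataFrame above is named df:

// Print the inferred column names and types
df.printSchema()

// Preview a few rows to confirm the delimiter and header options were applied correctly
df.show(5)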
The second statement registers the DataFrame as a temporary table so it can be queried with SQL. Spark SQL does not accept hyphens in unquoted table names, so the name uses an underscore instead.
// Register the DataFrame as a temporary table for SQL queries
df.registerTempTable("temp_table")
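Before creating the permanent table, the temporary table can be queried directly to sanity-check the data. A small sketch, assuming the temp_table name registered above:

// Run an ad-hoc query against the temporary table
sqlContext.sql("select count(*) from temp_table").show()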
The final statement creates a permanent table in the specified database by running a CREATE TABLE AS SELECT query against the temporary table. The query selects every column and row from the temporary table and stores the result as a new table in the database.
// Create a permanent table in the metastore from the temporary table
sqlContext.sql("""
  create table database.table as
  select * from temp_table
""")