pyspark with HCatalog table in Zeppelin
By : Hawk
Date : March 29 2020, 07:55 AM
This is expected behavior. Each table or DataFrame in Spark is bound to the specific SQLContext that was used to create it and cannot be accessed outside of it. Apache Zeppelin initializes both a SparkContext and a SQLContext, which are shared between interpreters and can be accessed as sc and sqlContext respectively. You should use these instances rather than create your own if you want to benefit from interpreter interoperability. In other words, don't create a custom context; use the default one.
|
zeppelin user Spark interpreter save DataFrame to Hbase Table
By : ncanquest
Date : March 29 2020, 07:55 AM
To work around this issue, the following call solved the problem:

hbaseRDD.saveAsNewAPIHadoopDataset(getJob("HWBTEST").getConfiguration)
|
Pyspark: Issues With Inserting Into a table in Hive using Zeppelin Notebook
By : Jovison
Date : March 29 2020, 07:55 AM
This may help you. NOTE: This is being written in a Zeppelin Notebook. You provided the following code. code :
%pyspark
from pyspark.context import SparkContext
sql = sqlContext.sql("INSERT INTO TABLE kenny_target(`user`, `age`) SELECT `user`, COALESCE(`age`, 0L) FROM kenny_source")
frame = sqlContext.createDataFrame(sql).collect()
frame.write.mode("append").saveAsTable("kenny_source_test")
# Your code (identifier quotes fixed to backticks):
sql = sqlContext.sql("INSERT INTO TABLE kenny_target(`user`, `age`) SELECT `user`, COALESCE(`age`, 0L) FROM kenny_source")
frame = sqlContext.createDataFrame(sql)
# sqlContext.sql() already returns a DataFrame, so createDataFrame(sql) is unnecessary.
# Since you wrote sql = sqlContext.sql('query'), you could write frame = sql.collect(),
# but collect() returns a plain Python list, which has no .write attribute.

# Instead, SELECT user and age directly into a DataFrame called 'frame'
frame = sqlContext.sql('''SELECT `user`, COALESCE(`age`, 0L) FROM kenny_source''')
# Write it to the table kenny_source_test
frame.write.mode("append").saveAsTable("kenny_source_test")
|
Zeppelin Notebook %pyspark interpreter vs %python interpreter
By : Mikhail
Date : March 29 2020, 07:55 AM
I hope this fixes the issue. When you run a %pyspark paragraph, Zeppelin will create a Spark context (the spark variable) automatically with the defined parameters (loading Spark packages, settings, ...). Have a look at the documentation of the spark interpreter for some of the possibilities. In a %python paragraph you can create a Spark context yourself, but it is not done automatically and will not use the parameters defined in the spark interpreter section. code :
%pyspark
sc
<SparkContext master=local[4] appName=ZeppelinHub>

%pyspark
spark
<pyspark.sql.session.SparkSession at 0x7fe757ca1f60>
|
Zeppelin - Cannot query with %sql a table I registered with pyspark
By : Sunny
Date : March 29 2020, 07:55 AM
This should help you out. Zeppelin can create different contexts for different interpreters, so if you executed some code with %spark and some code with %pyspark, your Zeppelin can have two contexts. When you then use %sql, it may be looking in a different context than the %pyspark one. Try restarting Zeppelin and execute the %pyspark code as the first statement and the %sql query as the second. If you go to the 'Interpreters' tab, you can also add zeppelin.spark.sql.stacktrace there; after restarting Zeppelin you will see the full stack trace in the place where you currently get 'Table not found'.
|