python - How to manage an Apache Spark context in Django?
I have a Django application that interacts with a Cassandra database, and I want to try using Apache Spark to run operations on that database. I have experience with Django and Cassandra, but I'm new to Apache Spark.
I know that to interact with a Spark cluster I first need to create a SparkContext, like this:

from pyspark import SparkContext, SparkConf

conf = SparkConf().setAppName(appName).setMaster(master)
sc = SparkContext(conf=conf)
My question is the following: how should I treat this context? Should I instantiate it when the application starts and let it live for the duration of its execution, or should I start a SparkContext every time before running an operation on the cluster and kill it when the operation finishes?
Thanks in advance.
For the last few days I've been working on this; since no one answered the post, here is my approach.
Apparently, creating a SparkContext incurs a bit of overhead, so stopping the context after every operation is not a good idea.
Also, there is apparently no downside to letting the context live while the application runs.
Therefore, my approach is to treat the SparkContext like a database connection: I created a singleton that instantiates the context when the application starts running and is used wherever needed.
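Here is a minimal sketch of what such a singleton might look like; the get_spark_context helper and the default app_name/master values are just illustrative, not part of any particular API:

from pyspark import SparkContext, SparkConf

_sc = None  # module-level singleton, created lazily on first use

def get_spark_context(app_name="my-django-app", master="local[*]"):
    # Return the shared SparkContext, creating it the first time it is requested.
    global _sc
    if _sc is None:
        conf = SparkConf().setAppName(app_name).setMaster(master)
        _sc = SparkContext(conf=conf)
    return _sc

Any Django view or task can then call get_spark_context() and always get the same context back, much like reusing a pooled database connection.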
I hope this can be helpful to someone, and I'm open to new suggestions on how to deal with this, since I'm still new to Apache Spark.