spark - connecting to MongoDB using the MongoDB Spark Connector




One way of connecting to a MongoDB database from Spark (rather than going through the usual MongoDB driver) is to use the MongoDB Spark Connector, as in the code below.

First, start PySpark from the command line with a --packages flag, which downloads the connector from the Maven repository; then run the code to connect and display the collection.


# Start pyspark with the following command:

# pyspark --conf "spark.mongodb.input.uri=mongodb://127.0.0.1/apptest.product?readPreference=primaryPreferred" \
# --conf "spark.mongodb.output.uri=mongodb://127.0.0.1/apptest.product" \
# --packages org.mongodb.spark:mongo-spark-connector_2.11:2.4.1


from pyspark.sql import SparkSession

spark = (SparkSession.builder
    .appName("mongo-email-pipeline")
    .config("spark.mongodb.input.uri", "mongodb://127.0.0.1/apptest.product")
    .config("spark.mongodb.output.uri", "mongodb://127.0.0.1/apptest.product")
    .getOrCreate())

# "mongo" is the short format name registered by the connector; the database
# and collection to read come from spark.mongodb.input.uri set above.
df = spark.read.format("mongo").load()

df.show()
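Once the read works, the same session can also write a DataFrame back to the collection named in spark.mongodb.output.uri. This is a minimal sketch assuming the SparkSession above is still active and a local MongoDB is running; mode("append") inserts new documents, while "overwrite" replaces the collection's contents.

```python
# Write the DataFrame back to MongoDB via the connector.
# The target database/collection come from spark.mongodb.output.uri.
df.write.format("mongo").mode("append").save()
```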
