
Apache Spark Dataset and SQL in Java


1) Introduction

In this tutorial, we use Apache Spark 2 from Java and look at how to work with the Dataset API and its SQL-style operations.

Note: the source code is available on GitHub.

2) Dependencies
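The original dependency listing is not shown here. As a rough sketch, assuming a Maven build, the Spark core and SQL modules could be declared as below. The version 2.2.0 and the Scala 2.11 artifact suffix are assumptions; use whichever Spark 2 release you target.

```xml
<!-- Assumed versions: any Spark 2.x release with a matching Scala suffix should work. -->
<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.2.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.2.0</version>
    </dependency>
</dependencies>
```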

3) Configure Spark in Local environment

Create a new Java class, SparkSQLExample.java.
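A minimal sketch of what the local configuration might look like, assuming the Spark 2 `SparkSession` entry point. The application name, the `createSession` helper, and the `local[*]` master (run in this JVM on all available cores) are illustrative choices, not taken from the original listing.

```java
import org.apache.spark.sql.SparkSession;

public class SparkSQLExample {

    // Hypothetical helper: builds (or reuses) a session that runs locally.
    static SparkSession createSession() {
        return SparkSession.builder()
                .appName("SparkSQLExample") // assumed application name
                .master("local[*]")         // local mode, all available cores
                .getOrCreate();
    }

    public static void main(String[] args) {
        SparkSession spark = createSession();
        System.out.println("Spark version: " + spark.version());
        spark.stop();
    }
}
```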

Then create Person.java as a POJO class.
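The fields of the original POJO are not shown, so the following is only a sketch with assumed fields `firstName`, `lastName`, and `age`. Spark's bean encoder needs a public class with a no-argument constructor and getter/setter pairs. `age` is declared as `long` here because Spark infers JSON integers as 64-bit values.

```java
import java.io.Serializable;

// Hypothetical field names; adjust them to match the keys in your JSON file.
public class Person implements Serializable {
    private String firstName;
    private String lastName;
    private long age;

    // A public no-argument constructor is required for a Java bean.
    public Person() {
    }

    public String getFirstName() { return firstName; }
    public void setFirstName(String firstName) { this.firstName = firstName; }

    public String getLastName() { return lastName; }
    public void setLastName(String lastName) { this.lastName = lastName; }

    public long getAge() { return age; }
    public void setAge(long age) { this.age = age; }
}
```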

4) Read a JSON file

Download the sample JSON file here.

Read the JSON file.

Show the data.

Print the schema.
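The three steps above can be sketched as follows. The file name `people.json`, its fields, and the `readPeople` helper are assumptions for illustration. By default Spark's JSON reader expects one JSON object per line.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ReadJsonExample {

    // Hypothetical helper: loads a JSON Lines file into an untyped Dataset.
    static Dataset<Row> readPeople(SparkSession spark, String path) {
        return spark.read().json(path);
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("ReadJsonExample")
                .master("local[*]")
                .getOrCreate();

        // Assumed input, one object per line, for example:
        // {"firstName":"Ann","lastName":"Lee","age":34}
        String path = args.length > 0 ? args[0] : "people.json";
        Dataset<Row> people = readPeople(spark, path);

        people.show();        // prints up to the first 20 rows as a table
        people.printSchema(); // prints the inferred column names and types

        spark.stop();
    }
}
```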

5) SQL operations

Select a single column from the JSON data.

Select several columns from the JSON data.
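A small sketch of both selections using `Dataset.select`. The column names `firstName` and `age`, the file name `people.json`, and the helper method names are assumed for illustration.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SelectExample {

    // Keeps only one column (assumed name: firstName).
    static Dataset<Row> selectName(Dataset<Row> df) {
        return df.select("firstName");
    }

    // Keeps two columns (assumed names: firstName, age).
    static Dataset<Row> selectNameAndAge(Dataset<Row> df) {
        return df.select("firstName", "age");
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("SelectExample")
                .master("local[*]")
                .getOrCreate();

        // Assumed input file with firstName, lastName and age fields.
        String path = args.length > 0 ? args[0] : "people.json";
        Dataset<Row> people = spark.read().json(path);

        selectName(people).show();
        selectNameAndAge(people).show();

        spark.stop();
    }
}
```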

GroupBy
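As an illustration of grouping, the sketch below counts rows per distinct value of an assumed `age` column. The file name and the `countByAge` helper are hypothetical.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class GroupByExample {

    // Returns one row per distinct age with a "count" column.
    static Dataset<Row> countByAge(Dataset<Row> df) {
        return df.groupBy("age").count();
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("GroupByExample")
                .master("local[*]")
                .getOrCreate();

        // Assumed input file with an integer "age" field.
        String path = args.length > 0 ? args[0] : "people.json";
        Dataset<Row> people = spark.read().json(path);

        countByAge(people).show();

        spark.stop();
    }
}
```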

Filters
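A filter might look like the sketch below, which keeps rows whose assumed `age` column is greater than a threshold. The threshold of 30, the file name, and the `olderThan` helper are illustrative only.

```java
import static org.apache.spark.sql.functions.col;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class FilterExample {

    // Keeps rows where age > minAge, using a Column expression.
    static Dataset<Row> olderThan(Dataset<Row> df, long minAge) {
        return df.filter(col("age").gt(minAge));
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("FilterExample")
                .master("local[*]")
                .getOrCreate();

        // Assumed input file with an integer "age" field.
        String path = args.length > 0 ? args[0] : "people.json";
        Dataset<Row> people = spark.read().json(path);

        olderThan(people, 30).show();

        spark.stop();
    }
}
```

The same condition can also be written as a SQL expression string, for example `people.filter("age > 30")`.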

Concatenate two columns
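One way to do this is with the `concat` and `lit` functions from `org.apache.spark.sql.functions`. In this sketch the column names `firstName` and `lastName`, the new column name `fullName`, and the `withFullName` helper are assumptions.

```java
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.concat;
import static org.apache.spark.sql.functions.lit;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ConcatExample {

    // Adds a "fullName" column built from firstName, a space, and lastName.
    static Dataset<Row> withFullName(Dataset<Row> df) {
        return df.withColumn("fullName",
                concat(col("firstName"), lit(" "), col("lastName")));
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("ConcatExample")
                .master("local[*]")
                .getOrCreate();

        // Assumed input file with firstName and lastName string fields.
        String path = args.length > 0 ? args[0] : "people.json";
        Dataset<Row> people = spark.read().json(path);

        withFullName(people).show();

        spark.stop();
    }
}
```

Note that `concat` yields null when any of its inputs is null; `concat_ws` is an alternative that skips null values.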

Convert to a Person object
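A sketch of the conversion using `Encoders.bean`, which maps columns to bean properties by name. To keep the example in one file, a `Person` bean is nested here with assumed fields `firstName`, `lastName`, and `age`; the file name and the `toPeople` helper are also hypothetical.

```java
import java.io.Serializable;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ConvertExample {

    // Assumed bean; age is long because JSON integers are inferred as 64-bit.
    public static class Person implements Serializable {
        private String firstName;
        private String lastName;
        private long age;

        public String getFirstName() { return firstName; }
        public void setFirstName(String firstName) { this.firstName = firstName; }
        public String getLastName() { return lastName; }
        public void setLastName(String lastName) { this.lastName = lastName; }
        public long getAge() { return age; }
        public void setAge(long age) { this.age = age; }
    }

    // Turns untyped rows into a typed Dataset<Person>.
    static Dataset<Person> toPeople(Dataset<Row> df) {
        return df.as(Encoders.bean(Person.class));
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("ConvertExample")
                .master("local[*]")
                .getOrCreate();

        // Assumed input file with firstName, lastName and age fields.
        String path = args.length > 0 ? args[0] : "people.json";
        List<Person> people = toPeople(spark.read().json(path)).collectAsList();

        for (Person p : people) {
            System.out.println(p.getFirstName() + " " + p.getLastName() + ", " + p.getAge());
        }

        spark.stop();
    }
}
```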

This is the complete example.

 

