Spark SQL Overview

What are Spark SQL and DataFrames? Spark SQL is a Spark module that lets you process and query structured data. It uses a special type of interface called Dataset, which has all the features of an RDD and can also store extra information used for optimizations. When operations are performed over structured data in […]

Spark RDD Overview and Hands-on

Introduction: An RDD, or Resilient Distributed Dataset, is the fundamental data structure of Spark. It can be considered Spark's main programming abstraction and resides in the Spark Core component. An RDD is a collection of items distributed across many cluster nodes that can be manipulated in parallel. Also note that Spark's RDDs are by default recomputed […]

Apache Spark Overview

Apache Spark is an open-source cluster computing framework with a fast in-memory data processing engine. It can be several times faster than MapReduce and provides libraries for development in R, Python, Scala, and Java. It provides streaming, SQL, machine learning, and graph processing capabilities. It can run on Hadoop, Mesos, standalone, or in the […]

Data Access And Analysis in Hadoop

This is Part III of the Big Data Overview blog series for developers. Part I: What is Big Data, What is Hadoop and Hadoop Ecosystem, managing Hadoop Cluster. Part II: Data Ingestion in Hadoop. Part III: Data Access And Analysis in Hadoop. In Part I, I covered the basics […]

Data Ingestion in Hadoop

This is Part II of the Big Data Overview blog series. Part I: What is Big Data, What is Hadoop and Hadoop Ecosystem, managing Hadoop Cluster. Part II: Data Ingestion in Hadoop. Part III: Data Access And Analysis in Hadoop. Data ingestion can be understood well by first […]

Big Data Overview

This is Part I of the Big Data Overview blog series for developers. Part I: What is Big Data, What is Hadoop and Hadoop Ecosystem, managing Hadoop Cluster. Part II: Data Ingestion in Hadoop. Part III: Data Access And Analysis in Hadoop. What is big data? Technology usage has grown exponentially […]