

*Imported data from different sources such as AWS S3 and LFS into Spark RDDs (a sketch follows this list).
*Gained hands-on experience with Amazon Web Services (AWS) cloud services such as EC2, S3, EBS, RDS and VPC.
*Used JSON and XML SerDes for serialization and deserialization to load JSON and XML data into Hive tables.
*Imported data from AWS S3 into Spark RDDs and performed transformations and actions on them.
*Implemented Apache Pig scripts to load data into and out of Hive.
*Used Impala for querying HDFS data to achieve better performance.
*Designed and created Hive external tables using a shared metastore instead of Derby, with partitioning, dynamic partitioning and buckets.
*Implemented Spark jobs using Scala and Spark SQL for faster testing and processing of data.
*Used the Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
*Developed Spark programs using Scala APIs to compare the performance of Spark with Hive and SQL.
*Developed web applications in the open-source Java framework Spring.
*Experience in designing user interfaces using HTML, CSS, JavaScript and JSP.
*Hands-on knowledge of core Java concepts such as exceptions, collections, data structures, multi-threading, serialization and deserialization.
*Good level of experience in Core Java and J2EE technologies such as JDBC, Servlets and JSP.
*Involved in cluster coordination services through ZooKeeper.
*Uploaded and processed terabytes of data from various structured and unstructured sources into HDFS (AWS cloud) using Sqoop and Flume.
*Experience in manipulating and analyzing large datasets and finding patterns and insights within structured and unstructured data.
*Experience in NoSQL column-oriented databases such as HBase and Cassandra, and their integration with a Hadoop cluster.
*Experience in analyzing data using HiveQL, Pig Latin and custom MapReduce programs in Java.
*Experience in data processing (collecting, aggregating and moving data from various sources) using Apache Flume and Kafka.
*Experience in working with Flume to load log data from multiple sources directly into HDFS.
*Migrated code from Hive to Apache Spark and Scala using Spark SQL and RDDs.
*Experience in creating tables, partitioning, bucketing, loading and aggregating data using Hive.
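A minimal sketch of the S3-to-RDD flow described above, using classic RDD transformations and actions. It assumes a cluster with the hadoop-aws connector and AWS credentials configured; the bucket name, paths and tab-delimited record layout are illustrative, not taken from any actual project.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object S3ToRddSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("s3-to-rdd"))

    // Load raw text from S3 into an RDD (illustrative bucket and path).
    val raw = sc.textFile("s3a://my-bucket/logs/2016-01-01/*.log")

    // Transformations are lazy: parse, drop malformed rows, key by user id.
    val hitsPerUser = raw
      .map(_.split("\t"))
      .filter(_.length >= 2)
      .map(fields => (fields(0), 1L))
      .reduceByKey(_ + _)

    // Actions trigger execution on the cluster.
    println(s"distinct users: ${hitsPerUser.count()}")
    hitsPerUser.saveAsTextFile("hdfs:///user/etl/hits_per_user")

    sc.stop()
  }
}
```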

*Experience in transferring data from RDBMS to HDFS and Hive tables using Sqoop.
*Experience with Hadoop distributions such as Cloudera 5.3 (CDH5, CDH3), Hortonworks and Amazon AWS.
*Expertise in using Spark SQL with various data sources such as JSON, Parquet and Hive (a sketch follows this list).
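A sketch of Spark SQL working across the data sources named above: JSON and Parquet files queried together and the result persisted as a Hive table. The file paths, schema and the analytics database name are assumptions for illustration, and Hive support must be enabled on the session.

```scala
import org.apache.spark.sql.SparkSession

object MixedSourcesSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("spark-sql-sources")
      .enableHiveSupport() // lets Spark read and write Hive tables
      .getOrCreate()

    // JSON and Parquet are read through the same DataFrame API.
    val customers = spark.read.json("hdfs:///data/customers.json")
    val events    = spark.read.parquet("hdfs:///data/events.parquet")
    customers.createOrReplaceTempView("customers")
    events.createOrReplaceTempView("events")

    // Join the two file-backed views with plain SQL.
    val report = spark.sql(
      """SELECT c.country, COUNT(*) AS clicks
        |FROM events e
        |JOIN customers c ON c.id = e.customer_id
        |GROUP BY c.country""".stripMargin)

    // Persist the result as a Hive table (assumes the database exists).
    report.write.mode("overwrite").saveAsTable("analytics.clicks_by_country")
    spark.stop()
  }
}
```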

*Hands-on experience in the main big data application phases: data ingestion, data analytics and data visualization (a representative Hive table definition follows).
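As one concrete piece of the ingestion-to-analytics flow, a partitioned Hive external table like the ones described earlier might be declared as follows, issued through Spark's Hive support against a shared metastore. All database, table, column and location names are illustrative assumptions; a bucketed variant would add a CLUSTERED BY ... INTO N BUCKETS clause in the HiveQL DDL.

```scala
import org.apache.spark.sql.SparkSession

object HiveTableSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-ddl-sketch")
      .enableHiveSupport() // shared Hive metastore rather than embedded Derby
      .getOrCreate()

    // External table: dropping it leaves the data files under LOCATION intact.
    spark.sql(
      """CREATE EXTERNAL TABLE IF NOT EXISTS logs.page_views (
        |  user_id  STRING,
        |  url      STRING,
        |  duration INT
        |)
        |PARTITIONED BY (view_date STRING)
        |STORED AS ORC
        |LOCATION 'hdfs:///warehouse/logs/page_views'""".stripMargin)

    // Allow dynamic partitioning so one INSERT can fan out across dates.
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

    spark.stop()
  }
}
```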

*Worked on HBase to perform real-time analytics, and experienced in CQL to extract data from Cassandra tables.
*Experience in using accumulator variables, broadcast variables and RDD caching in Spark (a sketch follows this list).
*Experienced in writing MapReduce programs in Java to process large datasets using map and reduce tasks.
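A minimal sketch of the three Spark shared-state techniques named above: a broadcast variable for a small lookup table, an accumulator for counting malformed rows across executors, and RDD caching for data reused by multiple actions. The input path, field layout and lookup values are illustrative assumptions.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SharedVarsSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("shared-vars"))

    // Broadcast: ship a small lookup table to every executor once.
    val countryNames = sc.broadcast(Map("US" -> "United States", "IN" -> "India"))

    // Accumulator: updated on executors, read on the driver after an action.
    val badRows = sc.longAccumulator("badRows")

    // cache() keeps the parsed lines in memory for the two counts below.
    val lines = sc.textFile("hdfs:///data/users.csv").cache()

    val resolved = lines.flatMap { line =>
      line.split(",") match {
        case Array(id, code) => // exactly two fields expected per row
          Some(id -> countryNames.value.getOrElse(code, "unknown"))
        case _ =>
          badRows.add(1); None
      }
    }

    println(s"rows: ${lines.count()}, resolved: ${resolved.count()}, bad: ${badRows.value}")
    sc.stop()
  }
}
```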
*Hands-on experience in the Analysis, Design, Coding and Testing phases of the Software Development Life Cycle (SDLC).
*In-depth understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming and Spark MLlib (a Spark Streaming sketch follows this list).
*Hands-on experience in installing, configuring and using Hadoop ecosystem components such as HDFS, MapReduce, Hive, Pig, YARN, Sqoop, Flume, HBase, Impala, Oozie, ZooKeeper, Kafka and Spark.
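A minimal Spark Streaming (DStream) sketch for the architecture bullet above: micro-batch word counting over a text stream. The socket source on localhost:9999 is an illustrative stand-in for a Flume or Kafka feed, and the 10-second batch interval is an arbitrary choice.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("streaming-sketch")
    val ssc  = new StreamingContext(conf, Seconds(10)) // 10s micro-batches

    // Illustrative source; in practice this might be a Kafka or Flume stream.
    val lines  = ssc.socketTextStream("localhost", 9999)
    val counts = lines.flatMap(_.split("\\s+"))
                      .map(word => (word, 1))
                      .reduceByKey(_ + _)

    counts.print() // emit each batch's counts to the driver log
    ssc.start()
    ssc.awaitTermination()
  }
}
```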
