Edureka’s PySpark Certification Training is designed to give you the knowledge and skills required to become a successful Spark Developer using Python, and to prepare you for the Cloudera Hadoop and Spark Developer Certification Exam (CCA175).
Throughout the PySpark Training, you will gain in-depth knowledge of Apache Spark and the Spark ecosystem, which includes Spark RDD, Spark SQL, Spark MLlib and Spark Streaming. You will also gain comprehensive knowledge of the Python programming language, HDFS, Sqoop, Flume, Spark GraphX and messaging systems such as Kafka.
Who can attend?
- Developers and Architects
- BI/ETL/DW Professionals
- Senior IT Professionals
- Mainframe Professionals
- Big Data Architects, Engineers and Developers
- Data Scientists and Analytics Professionals
Course curriculum
- Introduction to Big Data Hadoop and Spark
- Introduction to Python for Apache Spark
- Functions, OOPs, and Modules in Python
- Deep Dive into Apache Spark Framework
- Playing with Spark RDDs
- DataFrames and Spark SQL
- Machine Learning using Spark MLlib
- Deep Dive into Spark MLlib
- Understanding Apache Kafka and Apache Flume
- Apache Spark Streaming – Processing Multiple Batches
- Apache Spark Streaming – Data Sources
- Implementing an End-to-End Project
- Spark GraphX (Self-Paced)
What skills will you learn?
- Master the concepts of HDFS
- Understand Hadoop 2.x Architecture
- Learn data loading techniques using Sqoop
- Understand Spark and its Ecosystem
- Implement Spark operations on Spark Shell
- Understand the role of Spark RDD
- Work with RDD in Spark
- Implement Spark applications on YARN (Hadoop)
- Implement machine learning algorithms like clustering using Spark MLlib API
- Understand Spark SQL and its architecture
- Understand messaging systems like Kafka and their components
- Integrate Kafka with real-time streaming systems like Flume
- Use Kafka to produce and consume messages from various sources, including real-time streaming sources like Twitter
- Learn Spark Streaming
- Use Spark Streaming for stream processing of live data
- Solve multiple real-life, industry-based use cases, executed using Edureka’s CloudLab
For full details and online registration, visit the link below.
Note: NoticeBard is associated with Edureka via an affiliate programme.