University of Illinois Urbana-Champaign

Cloud Computing Applications, Part 2: Big Data and Applications in the Cloud


This course is part of the Cloud Computing Specialization

Instructors: Reza Farivar, Roy H. Campbell

34,445 already enrolled


5 modules

Gain insight into a topic and learn the fundamentals.

4.3

343 reviews

Flexible schedule

2 weeks at 10 hours a week

Learn at your own pace

90%

Most learners liked this course

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

5 assignments

Taught in English

Build your subject-matter expertise

This course is part of the Cloud Computing Specialization

When you enroll in this course, you'll also be enrolled in this Specialization.

Learn new concepts from industry experts
Gain a foundational understanding of a subject or tool
Develop job-relevant skills with hands-on projects
Earn a shareable career certificate

There are 5 modules in this course

Welcome to the Cloud Computing Applications course, the second part of a two-course series designed to give you a comprehensive view of the world of Cloud Computing and Big Data!

In this second course we continue Cloud Computing Applications by exploring how the Cloud opens up analytics on huge volumes of data that are static or streamed at high velocity and represent an enormous variety of information. Cloud applications and data analytics represent a disruptive change in the ways that society is informed by, and uses, information.

We start the first week by introducing some major systems for data analysis, including Spark, and the major frameworks and distributions of analytics applications, including Hortonworks, Cloudera, and MapR. By the middle of week one we introduce HDFS, the robust distributed file system used by many applications such as Hadoop, and we finish week one by exploring the powerful MapReduce programming model and how distributed operating systems like YARN and Mesos support a flexible and scalable environment for Big Data analytics.

In week two, our course introduces large-scale data storage and the difficulties of reaching consensus in enormous stores that use large numbers of processors, memories, and disks. We discuss eventual consistency, ACID, and BASE, and the consensus technologies used in data centers, including Paxos and ZooKeeper. Our course presents distributed key-value stores and in-memory databases like Redis, which are used in data centers for performance. Next we present NoSQL databases. We visit HBase, the scalable, low-latency database that supports database operations in applications that use Hadoop. We then show how Spark SQL can run SQL queries on huge data sets. We finish week two with a presentation on distributed publish/subscribe systems using Kafka, a distributed log messaging system that is finding wide use in connecting Big Data and streaming applications together to form complex systems.

Week three moves to fast data and real-time streaming, and introduces Storm, a technology used widely at companies such as Yahoo. We continue with Spark Streaming, Lambda and Kappa architectures, and a presentation of the streaming ecosystem.

Week four focuses on graph processing, machine learning, and deep learning. We introduce the ideas of graph processing and present Pregel, Giraph, and Spark GraphX. Then we move to machine learning with examples from Mahout and Spark: k-means, Naive Bayes, and frequent pattern mining (FPM). Spark ML and MLlib continue the theme of programmability and application construction. The last topic we cover in week four introduces deep learning technologies, including Theano, TensorFlow, CNTK, MXNet, and Caffe on Spark.

You will become familiar with the course, your classmates, and our learning environment. The orientation will also help you obtain the technical skills required for the course.

What's included

1 video • 4 readings • 1 assignment • 1 discussion prompt • 1 plugin

1 video•Total 26 minutes

Welcome to Cloud Applications, Part 2!•26 minutes

4 readings•Total 40 minutes

Syllabus•10 minutes
About the Discussion Forums•10 minutes
Updating Your Profile•10 minutes
Social Media•10 minutes

1 assignment•Total 30 minutes

Orientation Quiz•30 minutes

1 discussion prompt•Total 60 minutes

Getting to Know Your Classmates•60 minutes

1 plugin•Total 15 minutes

Welcome! Please tell us about yourself.•15 minutes

In Module 1, we introduce you to the world of Big Data applications. We start by introducing you to Apache Spark, a common framework used for many different tasks throughout the course. We then introduce some Big Data distro packages, the HDFS file system, and finally the idea of batch-based Big Data processing using the MapReduce programming paradigm.

What's included

13 videos • 1 reading • 1 assignment

13 videos•Total 108 minutes

1.1.1 Motivation for Spark•9 minutes
1.1.2 Apache Spark•11 minutes
1.1.3 Spark Example: Log Mining•9 minutes
1.1.4 Spark Example: Logistic Regression•8 minutes
1.1.5 RDD Fault Tolerance•4 minutes
1.1.6 Interactive Spark•4 minutes
1.1.7 Spark Implementation•5 minutes
1.2.1 Introduction to Distros•3 minutes
1.2.2 Hortonworks•24 minutes
1.2.3 Cloudera CDH•3 minutes
1.2.4 MapR Distro•2 minutes
1.3.1 HDFS Introduction•15 minutes
1.3.2 YARN and MESOS•10 minutes

1 reading•Total 10 minutes

Module 1 Overview•10 minutes

1 assignment•Total 30 minutes

Module 1 Quiz•30 minutes

In this module, you will learn about large scale data storage technologies and frameworks. We start by exploring the challenges of storing large data in distributed systems. We then discuss in-memory key/value storage systems, NoSQL distributed databases, and distributed publish/subscribe queues.

What's included

24 videos • 1 reading • 1 assignment

24 videos•Total 303 minutes

Module 2 Introduction•6 minutes
2.1.1 Introduction to MapReduce with Spark•4 minutes
2.1.2 MapReduce: Motivation•16 minutes
2.1.3 MapReduce Programming Model with Spark•9 minutes
2.1.4 MapReduce Example: Word Count•10 minutes
2.1.5 MapReduce Example: Pi Estimation & Image Smoothing•15 minutes
2.1.6 MapReduce Example: Page Rank•14 minutes
2.1.7 MapReduce Summary•4 minutes
2.2.1 Eventual Consistency – Part 1•11 minutes
2.2.2 Eventual Consistency – Part 2•20 minutes
2.2.3 Consistency Trade-Offs•5 minutes
2.2.4 ACID and BASE•19 minutes
2.2.5 Zookeeper and Paxos: Introduction•11 minutes
2.2.6 Paxos•18 minutes
2.2.7 Zookeeper•16 minutes
2.3.1 Cassandra Introduction•27 minutes
2.3.2 Redis•7 minutes
2.3.3 Redis Demonstration•14 minutes
2.4.1 HBase Usage API•16 minutes
2.4.2 HBase Internals - Part 1•18 minutes
2.4.3 HBase Internals - Part 2•9 minutes
2.4.4 Spark SQL•8 minutes
2.4.5 Spark SQL Demo•9 minutes
2.5.1 Kafka•18 minutes

1 reading•Total 10 minutes

Module 2 Overview•10 minutes

1 assignment•Total 30 minutes

Module 2 Quiz•30 minutes

This module introduces you to real-time streaming systems, also known as Fast Data. We discuss Apache Storm at length, followed by Apache Spark Streaming and the Lambda and Kappa architectures. Finally, we compare these technologies as parts of a streaming ecosystem.

What's included

18 videos • 1 reading • 1 assignment

18 videos•Total 216 minutes

Module 3 Introduction•10 minutes
3.1.1 Streaming Introduction•10 minutes
3.1.2 "Big Data Pipelines: The Rise of Real-Time"•7 minutes
3.1.3 Storm Introduction: Protocol Buffers & Thrift•15 minutes
3.1.4 A Storm Word Count Example•3 minutes
3.1.5 Writing the Storm Word Count Example•11 minutes
3.1.6 Storm Usage at Yahoo•4 minutes
3.2.1 Anchoring and Spout Replay•17 minutes
3.2.2 Trident: Exactly Once Processing•10 minutes
3.3.1 Inside Apache Storm•9 minutes
3.3.2 The Structure of a Storm Cluster•4 minutes
3.3.3 Using Thrift in Storm•10 minutes
3.3.4 How Storm Schedulers Work•12 minutes
3.3.5 Scaling Storm to 4000 Nodes•14 minutes
3.3.6 Q&A with Bobby Evans (Yahoo) on Storm•33 minutes
3.4.1 Spark Streaming•18 minutes
3.4.2 Lambda and Kappa Architecture•5 minutes
3.4.3 Streaming Ecosystem•24 minutes

1 reading•Total 10 minutes

Module 3 Overview•10 minutes

1 assignment•Total 30 minutes

Module 3 Quiz•30 minutes

In this module, we discuss the applications of Big Data. In particular, we focus on two topics: graph processing, where massive graphs (such as the web graph) are processed for information, and machine learning, where massive amounts of data are used to train models and run algorithms such as clustering and frequent pattern mining. We also introduce you to deep learning, where large data sets are used to train neural networks.

What's included

18 videos • 1 reading • 1 assignment • 1 discussion prompt • 1 plugin

18 videos•Total 173 minutes

4.1.1 Graph Processing•23 minutes
4.1.2 Pregel - Part 1•7 minutes
4.1.3 Pregel - Part 2•11 minutes
4.1.4 Pregel - Part 3•6 minutes
4.1.5 Giraph Introduction•7 minutes
4.1.6 Giraph Example•5 minutes
4.1.7 Spark GraphX•15 minutes
4.2.1 Big Data Machine Learning Introduction•13 minutes
4.2.2 Mahout: Introduction•9 minutes
4.2.3 Mahout kmeans•5 minutes
4.2.4 Mahout: Naïve Bayes•9 minutes
4.2.5 Mahout: fpm•7 minutes
4.2.6 Spark Naïve Bayes•3 minutes
4.2.7 Spark fpm•3 minutes
4.2.8 Spark ML/MLlib•12 minutes
4.2.9 Introduction to Deep Learning•20 minutes
4.2.10 Deep Neural Network Systems•18 minutes
4.3.1 Closing Remarks•1 minute

1 reading•Total 10 minutes

Module 4 Overview•10 minutes

1 assignment•Total 30 minutes

Module 4 Quiz•30 minutes

1 discussion prompt•Total 30 minutes

Final Reflections•30 minutes

1 plugin•Total 15 minutes

How was the course?•15 minutes

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Prepare for a degree

Taking this course by University of Illinois Urbana-Champaign may provide you with a preview of the topics, materials and instructors in a related degree program which can help you decide if the topic or university is right for you.

Instructors

Instructor ratings

4.8 (19 ratings)

Reza Farivar

University of Illinois Urbana-Champaign

5 Courses•72,374 learners

Roy H. Campbell

University of Illinois Urbana-Champaign

5 Courses•73,739 learners

Offered by

University of Illinois Urbana-Champaign

Explore more from Computer Security and Networks

Cloud Computing Concepts: Part 2•University of Illinois Urbana-Champaign•Course
Cloud Computing Project•University of Illinois Urbana-Champaign•Course
Cloud Computing Concepts, Part 1•University of Illinois Urbana-Champaign•Course
Cloud Computing Applications, Part 1: Cloud Systems and Infrastructure•University of Illinois Urbana-Champaign•Course

Learner reviews

5 stars•54.51%
4 stars•28.86%
3 stars•11.37%
2 stars•3.20%
1 star•2.04%

Showing 3 of 343

Reviewed on Mar 18, 2018

Good overview and jumping off points to go explore more. Great that a lot of tool sets were exposed to us. A list of all these tool sets in a document would be handy.

Reviewed on May 22, 2020

Good learning about big data and real life scenarios esp. Yahoo.

Reviewed on Feb 22, 2020

There are a lot of technologies to cover and it is a dynamically changing subject. However, it will be great adding some hands-on exercises.

Frequently asked questions

To access the course materials, assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.

Yes. In select learning programs, you can apply for financial aid or a scholarship if you can't afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you'll find a link to apply on the description page.

URL: https://www.coursera.org/learn/cloud-applications-part2