Hadoop & Spark
I'm far from a Hadoop or Spark guru, but I took a course that used both, and
I've collected here the publicly available resources that helped me through
the assignments in that course. I haven't written Java in quite a while, so
we used Python for the coding, and most of the links below reflect that. Some
resources also cover Java, Scala, or R.
Links
Mapreduce In Hadoop (video)
MapReduce Tutorial For Beginners (video)
MapReduce Tutorial (video)
Hadoop MapReduce Example (video)
Python Hadoop Tutorial for Beginners (video)
Python Hadoop Tutorial for Beginners (2) (video)
Inverted Index MapReduce Use Case (video)
Learn By Example: Hadoop, MapReduce for Big Data problems
Inverted Index Creation with example (video)
Writing An Hadoop MapReduce Program In Python
Using Python and Hadoop streaming to build an inverted index
Spark Tutorial For Beginners (video)
Not Your Father’s Database
What Is Apache Spark? (video)
Apache Spark Crash Course (video)
PySpark Tutorial for Beginners (video)
Apache Spark With Python Tutorial (video)
What is PySpark?
PySpark Dataframes Tutorial (video)
PySpark Concepts with Hands-On (video)
Social Network Analysis
Data Science with Spark : Analyzing Free Text from the Tweets (video)
Using TF-IDF to convert unstructured text to useful features (video)
PySpark MLlib Tutorial (video)
Classification and regression
Natural Language Processing with PySpark (video)
Spark ML. Feature Engineering for Texts, part 2 (video)
Sentiment Analysis with H2O, PySpark and Word2Vec on Qubole (video)
Deep Learning for Natural Language Processing Using Apache Spark and TensorFlow
Glint: An Asynchronous Parameter Server for Spark
Feature Extraction and Transformation - RDD-based API
PySpark: CountVectorizer|HashingTF
Spark MLlib TF-IDF – Example