The Hadoop Ecosystem


Here's a more detailed outline of my talk on March 12. If you have a use case you'd like me to discuss, we'd love to hear about it and possibly incorporate it into the talk to make it more relevant to you. Join us for ... (see the end of this post).

If you came here looking for the presentation, here it is.

Introduction

  1. What Hadoop is, and what it's not
  2. Origins and History
  3. Hello Hadoop: how to get started

The Hadoop Bestiary

  1. Core: Hadoop MapReduce and Hadoop Distributed File System (HDFS)
  2. Data Access: HBase, Pig and Hive
  3. Algorithms: Mahout
  4. Data Import: Flume, Sqoop and Nutch

The Hadoop Providers

  1. Apache
  2. Cloudera
  3. What to do if your data is in a database

The Hadoop Alternatives

  1. Amazon EMR
  2. Google App Engine

For those who weren't able to attend, here is the presentation:
