In Pictures: 18 essential Hadoop tools for crunching big data
The future is already arriving. For some algorithms, Hadoop can be slow because it generally relies on data stored on disk. That's acceptable when you're processing log files that are read only once, but all of that loading becomes a slog when you're accessing the same data again and again, as is common in many artificial intelligence programs. Spark is the next generation: it works like Hadoop but keeps its working data cached in memory. The illustration at left, from Apache's documentation, shows just how much faster it can run in the right situations.
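To see why caching matters for iterative workloads, here is a rough plain-Python sketch (not Spark code; the dataset, file name, and pass count are invented for illustration). One version re-reads and re-parses the data from disk on every pass, mimicking Hadoop's per-job disk I/O; the other parses once and reuses the in-memory result, mimicking Spark's cached datasets:

```python
import os
import tempfile

def write_dataset(path, n=100_000):
    # Invented sample data: one integer per line.
    with open(path, "w") as f:
        for i in range(n):
            f.write(f"{i}\n")

def iterate_from_disk(path, passes=5):
    # Every pass pays the full load-and-parse cost again,
    # as a chain of disk-backed Hadoop jobs would.
    total = 0
    for _ in range(passes):
        with open(path) as f:
            total += sum(int(line) for line in f)
    return total

def iterate_cached(path, passes=5):
    # Load and parse once, then reuse the in-memory data,
    # as Spark does with a cached dataset.
    with open(path) as f:
        data = [int(line) for line in f]
    return sum(sum(data) for _ in range(passes))

if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "numbers.txt")
    write_dataset(path)
    # Both approaches compute the same answer; only the I/O pattern differs.
    assert iterate_from_disk(path) == iterate_cached(path)
```

The speedup of the cached version grows with the number of passes, which is exactly the access pattern of iterative machine-learning algorithms.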
Spark is being incubated by Apache and is available from http://spark.incubator.apache.org/.