Big Data: Hadoop 2.0's YARN Framework enables Real-Time Data Analysis

By Dick Weisinger

Hadoop was originally developed as part of the Nutch web crawler project, where it was used for batch indexing the content of web sites. Big Data analytics applications have since adopted Hadoop, but because of the batch nature of the MapReduce algorithm, it isn’t good at interactively querying and analyzing real-time data streams.

Hadoop 2.0 attempts to expand the types of applications to which Hadoop can be applied. The new framework in Hadoop 2.0 is called YARN (an acronym for “Yet Another Resource Negotiator”), also referred to as MapReduce 2.0. It’s more generic than MapReduce and is capable of handling live streams of data and interactive queries.

Arun Murthy, co-founder of Hortonworks, said that “the power of YARN to enable applications to run ‘in’ Hadoop, instead of ‘on’ Hadoop, is the key to leveraging all other common services of the next-generation data platform, from security to data lifecycle management… You can now have both the batch MapReduce jobs and interactive SQL queries running right next to each other in YARN.”

Shaun Connolly, vice president of corporate strategy for Hortonworks, said that “YARN creates a cluster that is aware of all the different types of workloads and resource needs, so they can all cohabitate. You don’t get one workload dominating or taking over all the resources of the cluster.”

YARN extracts the resource management capability that was previously embedded in MapReduce and repackages it so that it can be used by new engines. With the new YARN framework, multiple applications can run simultaneously in Hadoop, all going through YARN as the common resource manager.
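To make the idea of a common resource manager concrete, here is a minimal Python sketch. It is a toy model, not the YARN API: the class `ToyResourceManager`, its `max_share` cap, and the application names are all hypothetical and chosen only for illustration. It shows several kinds of workloads requesting containers from one shared pool, with a simple per-application cap so that no single workload takes over the cluster.

```python
from dataclasses import dataclass, field


@dataclass
class ToyResourceManager:
    """Toy stand-in for a cluster-wide resource manager (NOT the real YARN API)."""

    total_containers: int
    # Hypothetical policy: the largest fraction of the cluster one app may hold.
    max_share: float = 0.5
    # Containers currently held, keyed by application name.
    allocated: dict = field(default_factory=dict)

    def free(self) -> int:
        """Containers not currently held by any application."""
        return self.total_containers - sum(self.allocated.values())

    def request(self, app: str, wanted: int) -> int:
        """Grant up to `wanted` containers, limited by free capacity and the per-app cap."""
        cap = int(self.total_containers * self.max_share)
        held = self.allocated.get(app, 0)
        granted = max(0, min(wanted, self.free(), cap - held))
        self.allocated[app] = held + granted
        return granted

    def release(self, app: str, count: int) -> None:
        """Return up to `count` containers held by `app` to the shared pool."""
        held = self.allocated.get(app, 0)
        self.allocated[app] = held - min(count, held)


if __name__ == "__main__":
    rm = ToyResourceManager(total_containers=10)
    # Three different engines share one pool through the same manager.
    print(rm.request("mapreduce-batch", 8))    # limited by the per-app cap
    print(rm.request("interactive-sql", 3))    # fits within free capacity
    print(rm.request("stream-processing", 4))  # limited by what is left
    print(rm.free())
```

The real YARN schedulers are far more sophisticated (queues, priorities, memory and CPU dimensions), but the shape is the same: engines ask a shared manager for resources rather than each owning the cluster.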

The new Hadoop 2.0 YARN framework enables:

Greater scalability
Improved cluster utilization
Support for workloads other than MapReduce
Agility

August 21st, 2013

Category: Big Data
