Learn How To Process Data Interactively And In Batch Using The Apache Tez Framework

Within Hadoop, MapReduce has been a widely used approach to processing data. In this approach, data is processed in batch mode, and results can take minutes, hours, or days to arrive. MapReduce is useful when a long wait for query results is not a problem. However, when you need query results within a few seconds, this processing model is no longer suitable. Apache Tez is a project in the Hadoop ecosystem that was developed to address the need for interactive data processing. The project began incubation in February 2013 and became a top-level project in July 2014. With Tez as the data processing framework, performance gains of up to 3 times over MapReduce are achievable. Apache Hive and Apache Pig are two of the Hadoop projects that have benefited greatly from the performance gains offered by Tez. Tez is not intended to be used directly by end users; instead, it serves as an underlying framework on which applications are built. Tez simplifies data processing by enabling data p…