TriHUG November Meeting, Featuring Alan Gates of Hortonworks

Tuesday, November 15, 2011 from 6:30 PM to 9:00 PM (ET)

Durham, NC

Ticket Information

Type      End      Quantity
TriHUG    Ended    Free

Event Details

Title: New Features in Pig 0.9 and Introducing HCatalog

Abstract: Pig 0.9 added several features that make Pig a more powerful data processing platform, including macros, include statements, and the ability to embed Pig in Python for control flow. We'll cover these, talk about some new features added since 0.9, and look at what's next on Pig's roadmap.

HCatalog is a table and storage management layer for Hadoop that enables users of different data processing tools (Pig, MapReduce, Hive, Streaming) to more easily read and write data on the grid. HCatalog’s table abstraction presents users with a relational view of data in the Hadoop Distributed File System (HDFS), so users need not worry about where their data is stored or in what format, whether RCFile, text files, or sequence files. This talk will include an overview of HCatalog's features and a discussion of its current roadmap.

Bio: Alan is an original member of the engineering team that took Pig from a Yahoo! Labs research project to a successful Apache open source project. Alan also designed HCatalog and guided its adoption as an Apache Incubator project. Alan has a BS in Mathematics from Oregon State University and an MA in Theology from Fuller Theological Seminary. He is also the author of Programming Pig, a forthcoming book from O’Reilly Press. Follow Alan on Twitter: @alanfgates.