Big Data Tech 2016 has ended
Tuesday, June 7 • 11:45am - 12:30pm
Unlocking the True Value of Hadoop with Open Data Science

The potential of Hadoop is clear across industry verticals, yet organizations still struggle to unlock its true value. Even with the latest execution engines like Spark, they find it hard to make full use of the data, advanced analytics and computing power in their clusters. Organizations need to enable data scientists and analysts to use existing analytics packages in a Hadoop environment or to perform distributed computations using tools in Open Data Science, including the Python and R ecosystems. Enterprises demand flexibility, high performance and efficient use of memory to scale up their Big Data workloads, especially for numerical and statistical computations. They need cost-effective business results from their Big Data investments and the ability to use the latest innovations to outperform their last-generation technology.

To deliver on this promise, Hadoop has to be simplified so that enterprises can use the skills they already have to get the most from Big Data. This requires scalable package and dependency management for existing analytics tools, as well as flexible parallel frameworks that scale up Big Data workloads, especially for heavy-duty machine learning.

In this session, enterprises will learn how to leverage the power of Open Data Science to extract value and get high-performance, interactive analytics from Hadoop. The speaker will demonstrate examples that include distributed natural language processing, distributed image processing with GPUs and distributed SQL queries. Data scientists will hear how to achieve lightning-fast processing of computationally intensive distributed analytics on Hadoop, including Python and R, to realize the full value of their Big Data.

Kristopher Overholt

Solution Architect, Continuum Analytics
Kristopher Overholt is a solution architect at Continuum Analytics who works with scientific software development and distributed/cluster computing, including Python, Hadoop and Spark for data analysis and data engineering workflows. Kristopher received a Ph.D. in Civil Engineering...

Tuesday June 7, 2016 11:45am - 12:30pm CDT
Auditorium (Fine Arts F2265) 9700 France Ave S, Bloomington, MN 55431