Big Data Tech 2016 has ended
Sessions are categorized to help you find the technical level you are looking for. Mild sessions are less technical, while spicy sessions are very technical. 


Tuesday, June 7
 

9:00am

Building A Stream Data Platform For Everyone: Apache Kafka and the Confluent Platform
As software systems grow and diversify, making sure every system and stakeholder has access to the data they need becomes a major obstacle to reaping the benefits of diverse, copious data. We'll discuss techniques we've explored at GovDelivery for maturing the mishmash of disparate systems, APIs, and data formats that growing organizations tend to accumulate into a robust, stream-centric microservices architecture. That architecture combines the open-source messaging and data management components of Apache Kafka and the Confluent Platform with the tools and processes such organizations already know and use.

Speakers

Benjamin Ortega

Software Architect, GovDelivery


Tuesday June 7, 2016 9:00am - 9:45am
P0806 (1st floor) 9700 France Ave S, Bloomington, MN 55431

9:00am

Free your Data (and your Analysts): The Keys to Becoming an Analytics-Driven Organization
Speakers

Dirk DeRoos

IBM Worldwide Analytics Platform Technical Leader, IBM


Tuesday June 7, 2016 9:00am - 9:45am
Garden Room (1st floor) 9700 France Ave S, Bloomington, MN 55431

9:00am

BASH Through the Barriers: Moderating the Clinical Research and Computational Divide
Free text. Incomplete data. Neurosurgery. Acronyms. Databases. Inaccurate data. Physician interpretations. Dual located data. Is it possible to create an architecture that fits these items together resourcefully? Can clinicians and computationalists harmonize? At the Brain Injury Research Lab, we are BASHing through these barriers. The Brain Injury Assessment Study at Hennepin County Medical Center (BASH) aims to develop a multimodal assessment of brain injury. Using BASH as a technical case study, attendees will learn about three main challenges faced in clinical research: data collection in medical research is not “controlled”; storing and managing protected health information poses its own problems; and study personnel bring a valuable knowledge base but differing skill levels to data processing and analysis. How BASH tackled these challenges (or plans to) will be explained. Lastly, the talk will describe futuristic solutions to the challenges of clinical research using alternative architectures that maintain discipline-specific components.

Speakers

Margaret Mahan

Scientist, Minneapolis Medical Research Foundation


Tuesday June 7, 2016 9:00am - 9:45am
P0808 (1st floor) 9700 France Ave S, Bloomington, MN 55431

9:00am

Conference Kickoff and Keynote
Speakers

SriSatish Ambati

Co-founder and CEO, H2O.ai


Tuesday June 7, 2016 9:00am - 9:45am
Auditorium (Fine Arts F2265) 9700 France Ave S, Bloomington, MN 55431

9:00am

HDFS/Hadoop Training
In this talk, attendees will gain experience working with Hadoop in a hands-on manner. They will interact with HDFS, MapReduce, and Hive to perform ETL and analytics. Students are not expected to have any hands-on experience with Hadoop. As this is a hands-on session, there will be minimal lecture. Each student will have the option of using either a virtual machine or a cluster in the cloud.

Speakers

Brock Noland

Chief Architect and Co-founder, phData, Inc


Tuesday June 7, 2016 9:00am - 10:30am
Lab (2nd floor) 9700 France Ave S, Bloomington, MN 55431

9:00am

Free LinkedIn Portraits
Stop in for a FREE professional LinkedIn profile portrait compliments of MinneAnalytics and Avanti Photography.

Tuesday June 7, 2016 9:00am - 12:00pm
TBA

10:00am

R Studio Server on Amazon EMR
Explore the convenience of the popular IDE for R while harnessing the power of SparkR (R on Spark) for distributed processing. See how to quickly set up R Studio Server on an EMR cluster and access the IDE via any web browser.  Keywords: R Studio, EMR, Distributed Data Frames, Machine Learning, SQL Context, fast aggregation.

Speakers

Chad Dvoracek

Data Engineer, The Nerdery
Inspired by an innate curiosity to solve complex problems, Chad was drawn to the potential of big data and emerging technologies. His first role in the field was in healthcare where he utilized distributed systems to automate and prepare data for an analytics team. In June 2015 an...


Tuesday June 7, 2016 10:00am - 10:30am
P0838 (1st floor) 9700 France Ave S, Bloomington, MN 55431

10:00am

High Volume Streaming Data: How Amazon Web Services is Changing Our Approach

Technologies for the capture and analysis of streaming data have changed over the years, and cloud technologies have taken us to a new level. Many people are not aware of the new technologies and architectural paradigms that are available today for near-real-time capture and analysis of high-volume data.

This presentation will examine Amazon Web Services’ offerings for streaming data analysis, compare how they have changed over the years, and take a look at what might be coming in the future. Real-life case studies and architectures will be shared to demonstrate how these technologies can be, and have been, used to successfully meet customer needs.


Speakers

Michael Krouze

CTO, Charter Solutions
Michael has a broad and deep knowledge of the Information Technology arena from both a technology and business perspective. He is an innovative thinker adept at planning and initiating change, a superb strategist with a solid combination of business and technical acumen, and a hands-on...


Tuesday June 7, 2016 10:00am - 10:30am
P0806 (1st floor) 9700 France Ave S, Bloomington, MN 55431

10:00am

A Bayesian Approach to Predictive Pricing
This talk will present how to use Bayesian modeling to predict the market value of collectibles based on recent sales.  It does so in an MCMC framework and provides an answer for how to predict the sale price of items that are rarely (or never) sold at auction.

Speakers

Josh Cutler

Distinguished Engineer, Optum


Tuesday June 7, 2016 10:00am - 10:30am
Garden Room (1st floor) 9700 France Ave S, Bloomington, MN 55431

10:00am

Data Warehousing the Easy Way with AWS Redshift
Data warehouses have commonly been built on relational databases, which are optimized for transactional workloads. The performance and storage limitations of these systems required special data models such as star schemas. The latest wave of column-based data stores, such as AWS Redshift, is optimized for analytical workloads and no longer requires these compromises to perform. Redshift also has an extensive group of partners that provide solutions to ease the loading, transformation and visualization of the source data.
This presentation will examine a case study of how Field Nation built a data warehouse and financial dashboards in three months using Redshift. We will look at the tools in the data pipeline from data extraction to delivery. By the end of the presentation you will have a new understanding of how to turn both in-house and SaaS provider data into a coherent whole for your users.

Speakers

Eric Ness

Senior Data Scientist, Field Nation



Tuesday June 7, 2016 10:00am - 10:30am
P0808 (1st floor) 9700 France Ave S, Bloomington, MN 55431

10:00am

Driving Data Initiatives (Even when you're not in charge)
Many analysts and data scientists are frustrated when their ideas, visions, and solutions are not adopted by organizations. Two related barriers contribute to this situation: a lack of systematic processes around data analytics, and cultures which are not data-driven. In this mini-“workshop” we will determine next actions to drive organizations toward analytic leadership. This event is targeted toward data champions – wherever they exist in the organization. Entrepreneurs and solution providers who champion analytics will also benefit from the workshop.

Speakers

Ryan Sougstad

Associate Professor, Augustana University


Tuesday June 7, 2016 10:00am - 10:30am
P1808 (2nd floor) 9700 France Ave S, Bloomington, MN 55431

10:00am

Start with a Question
Explore how to make the most of your big data expertise by turning nebulous business needs into operational research questions. Learn how to ask questions that are specific, covered by your data, and whose answers will deliver business value. This talk will focus on how everyone from engineers to executives can get the results they want by asking better questions.

Speakers

Drew Barwis

Application Architect, Dow Jones


Tuesday June 7, 2016 10:00am - 10:30am
P1838 (2nd floor) 9700 France Ave S, Bloomington, MN 55431

10:00am

The Journey to a Modern Data-to-Insight Pipeline
Getting data ready for use, whether for self-service analytics, building data products or any other data use case, shouldn’t be hard. Yet, traditional tools and processes require an overwhelming amount of time (and resourcing) to get through the murky realm between raw data and usable information. Ideally, the preparation process should be as fluid and interactive as possible, allowing everyone who needs information to access, clean, organize and transform data as quickly as they need it and with the context that makes sense for their use case.

Speakers

Nenshad Bardoliwalla

Chief Product Officer and Co-founder, Paxata


Tuesday June 7, 2016 10:00am - 10:30am
Auditorium (Fine Arts F2265) 9700 France Ave S, Bloomington, MN 55431

10:45am

Big Data, Big Event: The past and future of MinneAnalytics
Featuring members of MinneAnalytics' leadership.

Tuesday June 7, 2016 10:45am - 11:30am
P1838 (2nd floor) 9700 France Ave S, Bloomington, MN 55431

10:45am

Data Science is Getting Easier…Maybe?
Data Science is no longer the new buzzword without a definition. The Wikipedia entry for ‘data science’ has been available for 2 years. Organizations are moving beyond just the hype of data science and big data, and now they are onto the process of implementation. It can be difficult to know the first step when implementing data science. This talk will cover a process for getting your organization started with data science. It will then discuss how open source and cloud-based tools are making the implementation cheaper and easier than ever. The catch is: while data science is becoming easier to implement, the types of problems being attacked are growing in complexity.

Speakers

Ryan Swanstrom

Advanced Analytics Trainer, Microsoft


Tuesday June 7, 2016 10:45am - 11:30am
Garden Room (1st floor) 9700 France Ave S, Bloomington, MN 55431

10:45am

Graph Database with Cassandra
Discover the benefits of the up-and-coming database technology that uses graph structures and semantic queries for storing and searching data. Learn how to set up Titan with Cassandra as the storage backend, load data, run queries and perform analytics natively and with Apache Spark. Keywords: Graph Database, NoSQL, analytics, aggregation.

Speakers

Brandon Veber

Software Engineer, The Nerdery
A strong interest in math and science led Brandon to the engineering field. He was first trained as a Mechanical Engineer at Purdue University before specializing in Machine Learning while earning an M.S. degree from the school of Electrical and Computer Engineering at the University...



Tuesday June 7, 2016 10:45am - 11:30am
P0838 (1st floor) 9700 France Ave S, Bloomington, MN 55431

10:45am

Learning from History's Largest Source of Knowledge: The Promise and Pitfalls of Mining Wikipedia
Wikipedia offers 279 different language editions ranging from English and Estonian to Swahili and Scots. This talk showcases the algorithmic opportunities of this amazing source of knowledge and touches on its potentially dangerous limitations.  We will discuss two new algorithms that extend the groundbreaking word2vec neural network. The first produces remarkably accurate vector representations of over 50 million words in hundreds of different languages. The second leverages simple geographic features to accurately infer the relationships between geographic places. Finally, we'll discuss big-data studies that surface geographic and gender inequalities within Wikipedia, and explore the effects of this bias on algorithms that learn from Wikipedia.

Speakers

Shilad Sen

Associate Professor, Macalester College
Shilad Sen is an Associate Professor of Computer Science at Macalester College in St. Paul, MN and a data science research fellow for Target Corporation. He studies the relationship between algorithms, software, and people and focuses on biases and inequalities along dimensions such...


Tuesday June 7, 2016 10:45am - 11:30am
P0806 (1st floor) 9700 France Ave S, Bloomington, MN 55431

10:45am

Function Over Framework: Mapreduce, Tez and Spark - How to Select What Jobs to Run Where
Over its 7 years as the market leader in enabling business users to design Hadoop-based analytics, Datameer has seen patterns emerge across a wide array of customers and verticals in how users interact with their data and how datasets interact with one another, and has learned how best to approach them as frameworks have come into and out of vogue. Adam Gugliciello, Datameer's lead solution architect, will share some of those learnings, how they led to the development of Datameer's Smart Execution framework, and what triggers you might use to select a framework or frameworks to solve your problems most efficiently.

Speakers

Adam Gugliciello

Lead Architect, Datameer


Tuesday June 7, 2016 10:45am - 11:30am
P0808 (1st floor) 9700 France Ave S, Bloomington, MN 55431

10:45am

Enabling Big Data Research at the Minnesota Supercomputing Institute
Speakers

Jeff McDonald MSI

Assistant Director of HPC Operations, University of Minnesota


Tuesday June 7, 2016 10:45am - 11:30am
Auditorium (Fine Arts F2265) 9700 France Ave S, Bloomington, MN 55431

10:45am

Exploratory Data Analysis and Tag Identification with LDA model for Stack Overflow Data
Learn to use EDA and statistical techniques to discover key characteristics and the underlying structure of Stack Overflow data. The latent Dirichlet allocation (LDA) model helps to identify mixtures of topics with certain probabilities. See how the LDA algorithm can identify potential tags based on a question posted by a user.

Speakers

Sona Maniyan

Sr BI Analyst, Target

Ramesh Ramanujam

Senior Consultant, BestBuy


Tuesday June 7, 2016 10:45am - 11:30am
P1808 (2nd floor) 9700 France Ave S, Bloomington, MN 55431

10:45am

Spark Training
In this talk, attendees will gain experience working with Spark in a hands-on manner. In a directed fashion, they will build Spark programs for ETL and analytics. Students are expected to have at least a high-level understanding of Hadoop, Spark, or both. As this is a hands-on session, there will be minimal lecture. Each student will have the option of using either a virtual machine or a cluster in the cloud.

Speakers

Brock Noland

Chief Architect and Co-founder, phData, Inc


Tuesday June 7, 2016 10:45am - 12:15pm
Lab (2nd floor) 9700 France Ave S, Bloomington, MN 55431

10:45am

Watson Analytics: Your Data Scientist in a Box
Speakers

Nick Acosta

Developer Advocate, IBM
Before becoming a Developer Advocate at IBM, Nick studied computer science at Purdue University and the University of Southern California, and was a high performance computing consultant for Hewlett-Packard in Grenoble, France. He now specializes in machine learning and interacting...

Zach Taylor

Watson Analytics Specialist, IBM


Tuesday June 7, 2016 10:45am - 1:45pm
Room K1450 (2nd floor) 9700 France Ave S, Bloomington, MN 55431

11:45am

Detecting Bad Data with Zeppelin
Speakers

Frank Schilder

Research Director, Thomson Reuters

Tuesday June 7, 2016 11:45am - 12:15pm
P1838 (2nd floor) 9700 France Ave S, Bloomington, MN 55431

11:45am

AOL's High Throughput Big Data Ecosystem
Find out how AOL processes up to a million events per second and analyzes them in a combination of streaming and batch systems.

Speakers

Benjamin Jackson

Principal Software Architect, AOL Platforms


Tuesday June 7, 2016 11:45am - 12:15pm
P0838 (1st floor) 9700 France Ave S, Bloomington, MN 55431

11:45am

Open Data Science and the Python Ecosystems
The presentation will provide a short overview of the Python environment and the well-known Python libraries, from data munging and visualization to machine learning. IPython / Jupyter notebook will be used to run real-time demos targeting SciPy, the Python-based ecosystem of open-source software for mathematics, science, and engineering. Apache Zeppelin notebook will be demoed as well, including a PySpark / ML pipeline.

Speakers

Jean Marie Bertoncelli

Consulting Data Scientist, Voya


Tuesday June 7, 2016 11:45am - 12:15pm
Garden Room (1st floor) 9700 France Ave S, Bloomington, MN 55431

11:45am

Making Anyone a Hadoop User: Full-text Search with Apache Solr and Hadoop Made Easy
Apache Hadoop has grown from a distributed storage and compute platform into a vast ecosystem of tools that give organizations leverage to deal with big data. In this use-case focused talk, Mac Noland and Karl Lacher will discuss how Apache Solr on Apache Hadoop has opened the doors for deeper insight into structured and unstructured data.

Speakers

Karl Lacher

Solutions Architect, phData.io

Mac Noland

Solutions Architect, phData.io


Tuesday June 7, 2016 11:45am - 12:15pm
P0806 (1st floor) 9700 France Ave S, Bloomington, MN 55431

11:45am

Taming the Elephant: A General Mills Tale
From Source Control to Governance, this presentation will discuss what it takes to operationalize a production use case within the Hadoop ecosystem. This discussion includes best practices and factors to consider throughout the planning, implementation and execution stages of rolling out a Hadoop Cluster as well as pitfalls to avoid. Whether you are looking to implement your own or are trying to revamp an existing one, this presentation will give valuable knowledge about building and maintaining an operational Hadoop Cluster.

Speakers

Liz Carroll-Anderson

Data Platform Engineer, General Mills


Tuesday June 7, 2016 11:45am - 12:15pm
P0808 (1st floor) 9700 France Ave S, Bloomington, MN 55431

11:45am

Turning more than 80 Million Rows of Data Into Easy-to-use Insights with Spotfire, Python & SQL Server
Speakers

Shihe Ma, MS, MBA

Senior Database Marketing Manager, Medtronic


Tuesday June 7, 2016 11:45am - 12:15pm
P1808 (2nd floor) 9700 France Ave S, Bloomington, MN 55431

11:45am

Unlocking the True Value of Hadoop with Open Data Science
The high potential of Hadoop is clear across all verticals and the industry, yet organizations still struggle to unlock its true value. Even with the latest execution engines like Spark, they are challenged with leveraging the data, advanced analytics and computing power in their clusters. Organizations need to enable data scientists and analysts to leverage existing analytics packages in a Hadoop environment or perform distributed computations using tools in Open Data Science, including the Python and R ecosystems. Enterprises demand flexibility, high performance and efficient use of memory to scale up their Big Data workloads, especially for numerical and statistical computations. They need cost effective business results from their Big Data investments and the ability to leverage the latest innovations to outperform their last generation technology. 

In order to deliver on this promise, Hadoop has to be simplified so that the skills that exist in the enterprise can easily maximize the benefits of Big Data. This requires scalable package and dependency management of existing analytics tools as well as flexible parallel frameworks to scale up their Big Data workloads, especially for heavy duty machine learning. 

In this session, enterprises will learn how to leverage the power of Open Data Science to extract value and get high performance and interactive analytics from Hadoop. The speaker will demonstrate examples that include distributed natural language processing, distributed image processing with GPUs and distributed SQL queries. Data Scientists will hear about how to achieve lightning-fast processing of computationally intensive distributed analytics on Hadoop, including Python and R, to realize the full value of their Big Data.

Speakers

Kristopher Overholt

Solution Architect, Continuum Analytics
Kristopher Overholt is a solution architect at Continuum Analytics who works with scientific software development and distributed/cluster computing, including Python, Hadoop and Spark for data analysis and data engineering workflows. Kristopher received a Ph.D. in Civil Engineering...



Tuesday June 7, 2016 11:45am - 12:30pm
Auditorium (Fine Arts F2265) 9700 France Ave S, Bloomington, MN 55431

12:30pm

Analytics for Impatient People
Speakers

Ryan Hedlund

Solution Architect, Interana

Joe Kelleher

Account Executive, Interana, Inc.


Tuesday June 7, 2016 12:30pm - 1:00pm
P0806 (1st floor) 9700 France Ave S, Bloomington, MN 55431

12:30pm

Connected Cars, Gaming, and Oil Rigs - A Fast Connected World with the Cassandra Database
In this talk, Mitch will walk through real world IoT use cases that we've built solutions around and show a demo of one such solution. In doing so, she will break down what makes a database purpose built for IoT applications and how attendees can get started on building their own.

Speakers

Mitch Henderson

Solutions Engineer, DataStax


Tuesday June 7, 2016 12:30pm - 1:00pm
P0808 (1st floor) 9700 France Ave S, Bloomington, MN 55431

12:30pm

Lunch with Data Science Teachers and Learners
Tuesday June 7, 2016 12:30pm - 1:00pm
Garden Room (1st floor) 9700 France Ave S, Bloomington, MN 55431

12:30pm

Lunch with Job Seekers and Hiring Managers
Tuesday June 7, 2016 12:30pm - 1:00pm
P1808 (2nd floor) 9700 France Ave S, Bloomington, MN 55431

1:15pm

Data Wrangling on Hadoop – Overcoming Enterprise Challenges
Data wrangling has emerged as a fast-growing technology space over the past year but many businesses are still trying to decipher how it fits into their overall analytics or big data environment. In this presentation, we’ll discuss how data wrangling solutions can be leveraged to improve the efficiency of existing analytics processes and successfully execute new analytics initiatives. The talk will include examples from Trifacta customers including PepsiCo and Royal Bank of Scotland.

Speakers

Sean Kandel

CTO and Co-founder, Trifacta
Sean is Trifacta’s Chief Technical Officer and Co-founder. He completed his Ph.D. at Stanford University, where his research focused on user interfaces for database systems. At Stanford, Sean led development of new tools for data transformation and discovery, such as Data Wrangler...


Tuesday June 7, 2016 1:15pm - 1:45pm
P0838 (1st floor) 9700 France Ave S, Bloomington, MN 55431

1:15pm

Market Competitiveness Scoring
Thomson Reuters’ FindLaw business knows it is important to show our customers a strong return on their investment and data is the catalyst to make that happen. Building out a big data platform and adding analytics and tools for visualizing this information is making this important business function real. Hear how FindLaw is aggregating millions of industry SEO benchmarks like Moz, Majestic, web analytics, and various other media attributes to show key market characteristics.

Speakers

Lisa Schlosser

CTO FindLaw, Thomson Reuters


Tuesday June 7, 2016 1:15pm - 1:45pm
Auditorium (Fine Arts F2265) 9700 France Ave S, Bloomington, MN 55431

1:15pm

Master Data - The Foundation to Big Data
How do you maximize your Big Data? As our data sets grow and data science becomes more important in getting the right information at the right time, it becomes vital to ensure you have sound foundational data. This hinges on the quality of your Master Data.
You cannot know your consumer well if your Master Data isn’t of good quality; to know your consumer is to know your data. You cannot begin to connect your data without high-quality Master Data.

What tool does General Mills use to aid in this effort? Information Steward. Information Steward gives us the means to monitor, visualize and identify Master Data issues in order to bring us closer to connecting our data together effectively. Come find out more about why and how General Mills monitors data quality.

Speakers

Kristine Rennaker

SAP Master Data Analyst, General Mills


Tuesday June 7, 2016 1:15pm - 1:45pm
P0808 (1st floor) 9700 France Ave S, Bloomington, MN 55431

1:15pm

Data Analysis Using Spark and Cloudera Impala
Take an 8-hour query to 1.5 minutes with Hadoop and Impala.

Speakers

Kasaba Bharathi

Hadoop Engineer, Medtronic


Tuesday June 7, 2016 1:15pm - 1:45pm
Garden Room (1st floor) 9700 France Ave S, Bloomington, MN 55431

1:15pm

Text Convolution Neural Network (CNN) with TensorFlow & GPUs
Train models and save the earth at the same time! This talk examines the potential and limitations of the Graphics Processing Unit (GPU) for machine learning. We will evaluate GPUs against a business case: automated topic models. We found that GPU-based topic models reduced costs by at least 98% and execution time by 91% compared to single machine and Apache Spark alternatives. The talk also surveys GPU-based machine learning libraries and covers methods to scale beyond a single GPU.

Speakers

John Hudzina, PhD

Data and Analytics Architect, Thomson Reuters


Tuesday June 7, 2016 1:15pm - 1:45pm
P0806 (1st floor) 9700 France Ave S, Bloomington, MN 55431

1:15pm

Python Training -- Open Data Science with Anaconda: A Dozen of the Top Python Libraries in 90 Minutes.
Bring your laptop and make sure you download Anaconda from https://www.continuum.io/downloads before arriving! For any trainees who have not pre-downloaded Anaconda, Ian will provide the above link at the beginning of the session.

Speakers

Ian Stokes-Rees

Computational Scientist, Continuum Analytics
Ian is a computational scientist at Continuum Analytics. He has been with the company since the beginning and is the team lead for the Collaboration component of Anaconda Enterprise. One of Ian’s key focus areas includes working with Continuum clients to leverage Open Data Science to...


Tuesday June 7, 2016 1:15pm - 2:45pm
P1808 (2nd floor) 9700 France Ave S, Bloomington, MN 55431

2:00pm

Developing Fast and Scalable Customer Intelligence Solutions on AWS Redshift
Traditional business intelligence is dead. Regardless of whether your company sells to consumers or businesses, the decision making process at all levels of the enterprise is becoming increasingly dependent on insights derived from behavioral data generated by your customers. This dependency on customer intelligence requires a new set of “BI” tools and technologies capable not only of storing and processing very large amounts of data but also of analyzing such data quickly and at scale. Many of the existing business intelligence solutions have two key drawbacks when it comes to addressing these new requirements: they can be cost-prohibitive for many organizations if analysis of large datasets is required; and more importantly, they often require data pre-aggregation to achieve a desirable speed to insights. The Amazon Web Services (AWS) ecosystem, with AWS Redshift in particular, is designed to address both of these challenges. In this session, Jakub will describe how organizations can utilize Redshift and the AWS IaaS to develop robust, fast, and scalable customer intelligence solutions that can turn companies from data-rich to data-driven. Anyone who is considering moving their “BI stack” to Amazon or developing a prototype for their organization should attend this session.

Speakers

Jakub Jez

CEO, Centriam


Tuesday June 7, 2016 2:00pm - 2:45pm
P0838 (1st floor) 9700 France Ave S, Bloomington, MN 55431

2:00pm

Fighting Crime with SPSS Modeler and the Weather Channel Data
Speakers

Eric Lowry

Open Source Analytics Technical Leader, IBM

Maciej Lazarewicz, PhD

Senior Data Science Manager, Medtronic


Tuesday June 7, 2016 2:00pm - 2:45pm
P0806 (1st floor) 9700 France Ave S, Bloomington, MN 55431

2:00pm

Using Streaming Technology to Enable Real-time Applications
This discussion and demonstration will focus on leveraging MapR’s global event streaming technology to build real-time applications. MapR Streams will be highlighted as a key technology in enabling deployment of these types of applications globally, in real-time, and at IoT scale.

Speakers

Marc Fabbo

Senior Systems Engineer, MapR
Marc Fabbo is currently a Sr. Systems Engineer with MapR where he helps customers learn about Big Data and how they can harness the power of it for their businesses. He came to this role after an 18 year career with Teradata where the majority of his focus was helping customers realize...


Tuesday June 7, 2016 2:00pm - 2:45pm
P0808 (1st floor) 9700 France Ave S, Bloomington, MN 55431

2:00pm

How AoT is Monetizing IoT
The IoT is dominating the headlines lately and creating a lot of hype. But how are leading companies using AoT to create new business opportunities? Come and see some practical examples of AoT and the art of the possible in some real world use cases impacting: energy/utilities, manufacturing, healthcare, food safety/supply chain, the sentient financial website, and many others. We’ll get into how new tools are enabling this next wave of innovation.

Speakers

John Thuma

Director Aster Analytic Strategy, Teradata Big Data Practice


Tuesday June 7, 2016 2:00pm - 2:45pm
Garden Room (1st floor) 9700 France Ave S, Bloomington, MN 55431

2:00pm

Introduction to Scala and Spark
This presentation describes the Scala programming language and its position in the language space. Learn about the Apache Spark programming model and its role in the big data space, and discuss the Scala features that make it the first choice for Spark programming. This presentation will briefly comment on the Python and Java alternatives and will also cover some basic programming tools helpful for doing Scala-based Spark development. Finally, Brad will discuss his experience teaching these technologies in a graduate software engineering course.

Speakers

Brad Rubin

Professor, Director, Center of Excellence for Big Data, St Thomas
Brad Rubin is an Associate Professor at the University of St. Thomas in St. Paul in the Graduate Programs in Software department where he teaches Big Data Architecture, Software Analysis and Design, Computer Security, and Advanced Computer Security. Most recently, he is pursuing a...


Tuesday June 7, 2016 2:00pm - 2:45pm
Auditorium (Fine Arts F2265) 9700 France Ave S, Bloomington, MN 55431

3:00pm

Data Drives Change. Change Drives Opportunity
The biggest change in our lives today is the rate of change, driven by data and new technology. The economic and social impacts of these changes are profound, and as data leaders and data scientists we have important roles in shaping the opportunities created by data and the consequences of our new inventions.  

As data continues to become a core asset in the economy, it is driving key trends including:
  • blending of personal and professional lives
  • convergence of virtual and physical worlds
  • breakdown of boundaries between traditional vertical industries
During this session we will discuss opportunities created by these trends, as well as tactical practices for data scientists to increase the impact of their work.

Speakers

Amy O'Connor

Big Data Evangelist, Cloudera


Tuesday June 7, 2016 3:00pm - 3:30pm
Auditorium (Fine Arts F2265) 9700 France Ave S, Bloomington, MN 55431

3:00pm

Neural Networks for Newbies
Have you ever wondered how neural networks learn? How mathematics can emulate the brain? In this beginner's session, we'll start with the basic concepts of machine learning and create a simple yet effective image classifier using Google's TensorFlow platform. No experience necessary!

Speakers

Theo Kanning

Software Engineer, Target
I'm into robotics and fun IoT projects. Nothing too serious.


Tuesday June 7, 2016 3:00pm - 3:30pm
P0808 (1st floor) 9700 France Ave S, Bloomington, MN 55431

3:00pm

The Benefits of Spherical Cows (Hypothetical Modeling)
Hypothetical modeling of complex systems provides guidance and direction by helping to define investigations, streamline and refine data collection, focus data analyses and digest results. Even absurdly simple “spherical cow” modeling can provide these benefits.
However, in the rush to data, hypothetical modeling is often overlooked by data scientists. We will explore how building “spherical cow” models can be used to create simulated data sets and how to utilize those data sets to inform and better develop data-driven investigations. Our exploration will focus on an example investigation into the optimization of passengers boarding an airplane. Using Python and its familiar data science libraries, we will reduce that system to its basic components to build our model. Then we’ll use that model to probe the system and gain insights prior to any actual data collection. Also, there will be dinosaurs.

Speakers

Scott Ernst

Dir. Data Science & Engineering, When I Work


Tuesday June 7, 2016 3:00pm - 3:30pm
Garden Room (1st floor) 9700 France Ave S, Bloomington, MN 55431

3:00pm

Ensembles and Hyperparameter Optimization: Letting the Machines do the Learning
Ensemble learning has garnered a lot of attention in recent years. Instead of searching for the elusive “perfect” model, data scientists have gravitated towards building a number of diverse “acceptable” models that collectively outperform the best single model among them. However, creating robust ensembles often borders on wizardry, as we need to optimize dozens of tuning parameters for each model over high-dimensional search spaces. We’ll discuss widely used ensembling techniques and several tricks that practitioners employ in selecting parameters to improve generalizability over a single model.

Speakers

Ravi Shanbhag

Sr. Director, Data Science, UnitedHealth Group


Tuesday June 7, 2016 3:00pm - 3:30pm
P0806 (1st floor) 9700 France Ave S, Bloomington, MN 55431

3:00pm

Python Training -- Open Data Science with Anaconda: A Dozen of the Top Python Libraries in 90 Minutes
Bring your laptop and make sure you download Anaconda before arriving: https://www.continuum.io/downloads. For any trainees who have not pre-downloaded Anaconda, Ian will provide the above link at the beginning of the session.

Speakers

Ian Stokes-Rees

Computational Scientist, Continuum Analytics
Ian is a computational scientist at Continuum Analytics. He has been with the company since the beginning and is the team lead for the Collaboration component of Anaconda Enterprise. One of Ian’s key focus areas is working with Continuum clients to leverage Open Data Science to...


Tuesday June 7, 2016 3:00pm - 4:30pm
P1808 (2nd floor) 9700 France Ave S, Bloomington, MN 55431

3:45pm

Counting People With a Camera and Computer Number Tables: A Talk Using Only the 1000 Most Simple Words In Our Language
Speakers

Pat Delaney

Founder, IoTFuse


Tuesday June 7, 2016 3:45pm - 4:30pm
P0806 (1st floor) 9700 France Ave S, Bloomington, MN 55431

3:45pm

Open Lightning Talks
Tuesday June 7, 2016 3:45pm - 4:30pm
Garden Room (1st floor) 9700 France Ave S, Bloomington, MN 55431

3:45pm

Using Our Brain to Understand Learning
There are important clues from neuroarchitecture that neuroscientists have already discovered which provide insight into how we learn. Identifying these clues and successfully implementing them to enhance machine learning has been my personal effort for many years. The process implemented by BoonLogic for machine learning engages these clues to enable a radically different approach from the more traditional, computationally intense methods widely utilized today. This presentation will describe the simplicity, exceptional performance, and expanded capabilities of our implementation through the Pattern Memory Engine (PME).

Speakers

Grant Goris

COO, Boon Logic, Inc.
Real-time UNSUPERVISED Machine Learning on Edge and remote devices.


Tuesday June 7, 2016 3:45pm - 4:30pm
Auditorium (Fine Arts F2265) 9700 France Ave S, Bloomington, MN 55431

4:30pm

Networking Social
Engage with your peers and enjoy Surly beers!

Tuesday June 7, 2016 4:30pm - 5:00pm
TBA