Dean Wampler

Organizer Chicago Apache Spark & Scala User Groups @ GOTO Chicago 2015

  • Dean Wampler

    Organizer Chicago Spark & Scala User Groups

    Dean Wampler is the Big Data Architect at Typesafe and specializes in applying Functional Programming principles to “Big Data” applications, using Hadoop and alternative technologies. Dean is a contributor to several open-source projects and the founder of the Chicago-Area Scala Enthusiasts. He is the author of Functional Programming for Java Developers, the co-author of Programming Scala, and the co-author of Programming Hive, all from O’Reilly. He pontificates on Twitter, @deanwampler, and at

    GOTO Chicago 2015

    Data Science at Scale with Spark

    Apache Spark has been blessed as the replacement for MapReduce in Hadoop environments. It also runs in other deployment modes. Spark provides better performance and better developer productivity, and it supports a wider range of application scenarios than MapReduce, including event stream processing, ad hoc queries, graphs, and iterative algorithms. Graphs are a natural way to represent many data sets, such as social media networks, and iterative algorithms are important for Machine Learning tasks such as model training with gradient descent.

    This talk discusses Spark from a Data Science perspective: its strengths and weaknesses, the Scala, Java, Python, and R APIs it offers for common analytics problems, what's missing, and what's planned. We'll look at support for ad hoc queries over large data sets, machine learning algorithms, graph processing, the programmer experience, and the pragmatic concerns of running applications.
