name: inverse layout: true class: center, middle, inverse --- # Java 8 FP Principles An introduction to Functional Programming (FP) principles in Java 8 ??? * This talk will briefly discuss some basic functional programming concepts and the new streaming API in Java 8 --- layout: false class: large-points .left-column[ ## Function First-Class ] .right-column[ ###In FP, functions are a first-class construct just like an object, variable, value or entity - Anonymous functions can be declared at run time - Functions can be passed in as arguments to other functions - Functions can return other functions! - Functions can be nested within one another ] ??? * A programming language is said to have first-class functions if functions can be treated like any other object, variable, value, or entity. * Anonymous functions are created at run time and are often referred to as Lambda's within programming languages * When functions are first-class they can be passed into other functions as arguments or as a return value. * When a function returns another function it's referred to as currying, which is outside the scope of this talk. * Functions can be nested within other functions to take advantage of lexical scoping of values of their enclosing parent function. * An analogy is how class instance method has access to its classes instance members --- class: large-points .left-column[ ## Function First-Class ## Immutability & State ] .right-column[ ###Immutable values are preferred to mutable variables in **pure** FP * AKA constants * Benefits: inherently threadsafe, readability, easier to reason about, security * Pure functions always return the same value given the same arguments * Side effects * Variables in Java are mutable by default ] ??? * Immutable values can be thought of as constants in typical imperative languages * Benefits * When a function has immutable state you can guarantee that a function will always work without side effects in multi-threaded scenarios * For the same reason it makes multithreading safe it also makes readability and reasoning easier because you can be certain no other code or threads can change its value while it's being executed * Security, same reason as above. * In Pure functional programming languages all values are immutable and its not possible to change an existing values state * Pure functions return the same value every time given the same input arguments. * Pure functions are defined as functions that have no side effects * They're like mathematical functions in that they always return the same value given the same arguments * A side effect occurs when a function modifies some state that is observable outside of the function. * For example: changing a global variable, a static variable, throwing an exception, or modifying input arguments * Limiting side effects reduces the possibility of unforeseen consequences to the rest of the system * Variables in Java are mutable by default. * Not all FP orientated languages are strict about immutability * Some languages, like Scala, adopt a hybrid approach supporting both mutable variables and immutable values --- class: large-points .left-column[ ## Parallelism ] .right-column[ ###Since pure functions are inherently thread-safe, FP implementations can easily be run in parallel. * Race conditions * Divide & conquer * Parallel processing * Distributed computing
] ??? * A race condition occurs when the output changes based on uncontrollable external events. * In FP, race conditions are caused by multithreaded applications reading and **writing** the same mutable state * Some of the most notoriously hard to find concurrency bugs are ultimately caused by mutable state shared between threads * Divide & conquer: Parallel processing is akin to the "divide and conquer" * Parallel processing is the act of running a program on more than one processor core. * This can enormously increase the speed of computational intensive operations * Input is split up into atomic chunks that can be processed separately in different threads on different processor cores. * When you utilize FP concepts like immutable state and first-class functions you can safely run functions in parallel with no side effects * Distributed computing is the same as parallel processing, but zoomed out to a cluster of computers * Inputs are split up and distributed to different computers over the network * Each computer contain worker threads that further split up the input further * Each worker thread runs your function over the data * Apache Hadoop MapReduce and Apache Spark are examples of applications that allow you to massively parallelize work over commodity hardware. * Performance can be increased by scaling out by adding more servers to a cluster * Scaling out a cluster is more cost effective than scaling up individual servers * *see next slide for hadoop cluster picture* --- class: large-points .left-column[ ##Parallelism ##Ex) Hadoop ] .right-column[ ### Apache Hadoop YARN cluster
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html ] --- # What is a Lambda? λ Lambda's are a new syntax in Java 8 to create an anonymous function ```java () -> println("Hello from lambda-land!"); ``` They're comprised of arguments `()`, the "arrow" notation `->`, and a function body. Lambda's return a `Runnable` type. ```java // Anonymous Runnable Runnable r1 = new Runnable(){ @Override public void run(){ System.out.println("Hello world one!"); } }; // Lambda Runnable Runnable r2 = () -> System.out.println("Hello world two!"); ``` ??? * Java 8 lambda's are syntactic sugar that allow you to replace verbose anonymous inner classes * Composition of a lambda * They're comprised of optional arguments `()`, the "arrow" notation `->`, and a function body. * Function body can optionally include braces if it's only one line. * The function body can optionally return a value or nothing (void) * Runnable type * Other common use cases for lambda's: async/Futures, callbacks --- # Java 8 Streams API ###A stream is *"a sequence of elements from a source that supports aggregate operations.”* * A sequence of elements where each element is computed on demand * Source: Streams consume from a data-providing source such as collections, arrays, or I/O resources. * Aggregate operations such as filter, map, reduce, find, match, sorted, and so on. ### Pipelining * Stream/aggregate operations return streams themselves (method chaining) ??? * Sequence of elements: A stream provides an interface to a sequenced set of values of a specific element type. However, streams don’t actually store elements; they are computed on demand. * Source: Streams consume from a data-providing source such as collections, arrays, or I/O resources. * Aggregate operations: Streams support SQL-like operations and common operations from functional programing languages, such as filter, map, reduce, find, match, sorted (comparators), and so on. * Pipelining: Stream operations return other streams using method chaining * Operations can be added together to build larger pipelines --- # Stream API Examples ## Predicate A predicate is a special kind of function that returns a boolean based on an input value. A lambda that returns a boolean can be cast to the `Predicate
` interface. ```java Predicate
evenNumPredicate = num -> num % 2 == 0; ``` There are several Streams API operations that use predicates: `allMatch`, `anyMatch`, `filter`. Ex) Assert that all numbers in a given list are less than 11 ```java List
numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10); boolean allNumbersLessThanEleven = numbers.stream() .allMatch(number -> number < 11); assertEquals(allNumbersLessThanEleven, true); ``` ??? * A predicate is a special kind of function which returns a boolean value given an input argument * A lambda can be associated to the Predicate type when it returns a boolean. * Predicates are useful when truth values over an entire stream, * for example to see if all elements in a stream pass a condition (allmatch), if they do then return true * anyMatch will return true if any element is true given a Predicate * filter returns a new stream of elements of the same type that satisfy a given Predicate --- # Stream Pipelining Example Pipelining is a common action for transforming a dataset using filters, map, and reduce/aggregation operations. Ex) Limit first 2 squares of even numbers. ```java List
numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8); List
twoEvenSquares = numbers.stream() // .parallelStream .filter(n -> n % 2 == 0) // predicate .map(n -> n * n) // transform .limit(2) .collect(Collectors.toList()); assertEquals(twoEvenSquares.size(), 2); assertArrayEquals(new Integer[]{4, 16}, twoEvenSquares.toArray()); ``` --- # Without Pipelining An equivalent implementation without the Stream API ```java List
numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8); List
twoEvenSquares = new ArrayList
(); for(int i = 0; i < numbers.size(); i++) { Integer num = numbers.get(i); if (num % 2 != 0) continue; twoEvenSquares.add(num*num); if (twoEvenSquares.size() == 2) break; } assertEquals(twoEvenSquares.size(), 2); assertArrayEquals(new Integer[]{4, 16}, twoEvenSquares.toArray()); ``` --- # References Wikipedia pages * [Functional Programming](http://en.wikipedia.org/wiki/Functional_programming), [First-class functions](http://en.wikipedia.org/wiki/First-class_function), [Side effects](http://en.wikipedia.org/wiki/Side_effect_%28computer_science%29), [Immutable Object](http://en.wikipedia.org/wiki/Immutable_object), [Race Condition](http://en.wikipedia.org/wiki/Race_condition), [Parallel Processing](http://en.wikipedia.org/wiki/Parallel_processing) Oracle Docs * **[Processing Data with Java SE 8 Streams](http://www.oracle.com/technetwork/articles/java/ma14-java-se-8-streams-2177646.html)** * [Oracle Java 8 SE Lambda Quick Start](http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/Lambda-QuickStart/index.html) * [Oracle Lambda expressions tutorial](http://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html) * [Parallelism](http://docs.oracle.com/javase/tutorial/collections/streams/parallelism.html) Courses * [Functional Programming Principles in Scala](https://www.coursera.org/course/progfun) Misc. * [Apache Hadoop NextGen MapReduce (YARN)](http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html)