Stream API `groupingBy` examples

Published on August 20, 2024

Exploring `Collectors.groupingBy` in Java Stream API

The Java Stream API provides tools to process collections in a functional style. One of these tools is Collectors.groupingBy, which groups stream elements by a key computed from each element and, optionally, applies a downstream collector to every group. This blog post walks through examples of Collectors.groupingBy, from simple grouping to downstream operations such as mapping, counting, summing, finding a maximum, and filtering.

1. Grouping by a Simple Condition

The most basic use of groupingBy is to classify elements based on a specified key. For example, we can group Person objects by their Department:

Map<Department, List<Person>> map1 = persons.stream()
    .collect(Collectors.groupingBy(Person::department));
System.out.println("Grouping all persons by department");
System.out.println(map1);

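This snippet puts all persons into lists, one list per department. Because Department is a record, two instances with the same id and name are equal, so persons created with separate Department objects still land in the same group.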

1.1. Grouping by a Condition and Extracting Specific Data

Sometimes, you may want to group elements but only keep specific attributes. The following example groups persons by department but collects only their IDs:

Map<Department, List<Integer>> map2 = persons.stream()
    .collect(Collectors.groupingBy(Person::department, Collectors.mapping(Person::id, Collectors.toList())));
System.out.println("Grouping all persons' IDs by department");
System.out.println(map2);
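
The downstream collector passed to mapping does not have to be toList(). As a minimal sketch, assuming a comma-separated string of names per department is wanted, Collectors.joining can take its place. The class name JoinNamesSketch and the helper sample() are made up for illustration; the records mirror the ones in the full listing.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class JoinNamesSketch {
    // Same shapes as the article's records, redeclared so the sketch compiles on its own.
    record Department(int id, String name) {}
    record Person(int id, String name, double salary, Department department) {}

    // Hypothetical helper: a small subset of the article's sample data.
    static List<Person> sample() {
        Department hr = new Department(1, "HR");
        Department finance = new Department(2, "Finance");
        return List.of(
                new Person(1, "Alex", 100d, hr),
                new Person(2, "Brian", 200d, hr),
                new Person(3, "Charles", 900d, finance));
    }

    // Groups by department name and joins the names of each group into one string.
    static Map<String, String> namesByDepartment(List<Person> persons) {
        return persons.stream()
                .collect(Collectors.groupingBy(
                        person -> person.department().name(),
                        Collectors.mapping(Person::name, Collectors.joining(", "))));
    }

    public static void main(String[] args) {
        System.out.println(namesByDepartment(sample()));
    }
}
```

On a sequential stream the names inside each string keep the encounter order of the source list.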

2. Grouping with Counting

Collectors.groupingBy can be combined with other collectors like counting() to perform aggregation operations on grouped data:

Map<Department, Long> map3 = persons.stream()
    .collect(Collectors.groupingBy(Person::department, Collectors.counting()));
System.out.println("Counting all persons in a department.");
System.out.println(map3);

This example counts the number of persons in each department.
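
The two-argument form gives no control over the order of the keys in the result. A possible variation, assuming sorted keys are wanted, is the three-argument overload that also takes a map supplier. The sketch below keys by department name rather than by Department, because TreeMap needs comparable keys and the Department record is not Comparable. SortedCountSketch and sample() are illustrative names only.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class SortedCountSketch {
    // Same shapes as the article's records, redeclared so the sketch compiles on its own.
    record Department(int id, String name) {}
    record Person(int id, String name, double salary, Department department) {}

    // Hypothetical helper: the article's sample data.
    static List<Person> sample() {
        Department hr = new Department(1, "HR");
        Department finance = new Department(2, "Finance");
        Department admin = new Department(3, "ADMIN");
        return List.of(
                new Person(1, "Alex", 100d, hr),
                new Person(2, "Brian", 200d, hr),
                new Person(3, "Charles", 900d, finance),
                new Person(4, "David", 200d, finance),
                new Person(5, "Edward", 200d, finance),
                new Person(6, "Frank", 800d, admin),
                new Person(7, "George", 900d, admin));
    }

    // Counts persons per department name; TreeMap::new keeps the keys in natural String order.
    static Map<String, Long> countByDepartmentSorted(List<Person> persons) {
        return persons.stream()
                .collect(Collectors.groupingBy(
                        person -> person.department().name(),
                        TreeMap::new,
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(countByDepartmentSorted(sample())); // prints {ADMIN=2, Finance=3, HR=2}
    }
}
```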

3. Grouping by a Different Attribute

We can also group persons by their salary and map their names:

Map<Double, List<String>> map4 = persons.stream()
    .collect(Collectors.groupingBy(Person::salary, Collectors.mapping(Person::name, Collectors.toList())));
System.out.println("Grouping persons by the same salary.");
System.out.println(map4);
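
Since the downstream collector can be any collector, it can itself be another groupingBy. The following sketch, which is an extension of the example above rather than part of the full listing, groups first by department name and then by salary. NestedGroupingSketch and sample() are names chosen for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class NestedGroupingSketch {
    // Same shapes as the article's records, redeclared so the sketch compiles on its own.
    record Department(int id, String name) {}
    record Person(int id, String name, double salary, Department department) {}

    // Hypothetical helper: the article's sample data.
    static List<Person> sample() {
        Department hr = new Department(1, "HR");
        Department finance = new Department(2, "Finance");
        Department admin = new Department(3, "ADMIN");
        return List.of(
                new Person(1, "Alex", 100d, hr),
                new Person(2, "Brian", 200d, hr),
                new Person(3, "Charles", 900d, finance),
                new Person(4, "David", 200d, finance),
                new Person(5, "Edward", 200d, finance),
                new Person(6, "Frank", 800d, admin),
                new Person(7, "George", 900d, admin));
    }

    // Outer key: department name. Inner key: salary. Values: names sharing that salary.
    static Map<String, Map<Double, List<String>>> namesBySalaryWithinDepartment(List<Person> persons) {
        return persons.stream()
                .collect(Collectors.groupingBy(
                        person -> person.department().name(),
                        Collectors.groupingBy(
                                Person::salary,
                                Collectors.mapping(Person::name, Collectors.toList()))));
    }

    public static void main(String[] args) {
        System.out.println(namesBySalaryWithinDepartment(sample()));
    }
}
```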

4. Summing Values in Each Group

groupingBy can be used with summingDouble to calculate the sum of a particular attribute, like salary, for each group:

Map<String, Double> map5 = persons.stream()
    .collect(Collectors.groupingBy(person -> person.department().name(), Collectors.summingDouble(Person::salary)));
System.out.println("Summing salaries by department name");
System.out.println(map5);

This code sums the salaries of all persons within each department.
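
The full listing further down also computes the average salary per department (map6) with Collectors.averagingDouble. A self-contained sketch of that variant follows; AverageSalarySketch and sample() are illustrative names, and the data is the same as in the full listing.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class AverageSalarySketch {
    // Same shapes as the article's records, redeclared so the sketch compiles on its own.
    record Department(int id, String name) {}
    record Person(int id, String name, double salary, Department department) {}

    // Hypothetical helper: the article's sample data.
    static List<Person> sample() {
        Department hr = new Department(1, "HR");
        Department finance = new Department(2, "Finance");
        Department admin = new Department(3, "ADMIN");
        return List.of(
                new Person(1, "Alex", 100d, hr),
                new Person(2, "Brian", 200d, hr),
                new Person(3, "Charles", 900d, finance),
                new Person(4, "David", 200d, finance),
                new Person(5, "Edward", 200d, finance),
                new Person(6, "Frank", 800d, admin),
                new Person(7, "George", 900d, admin));
    }

    // Averages the salaries of all persons within each department name.
    static Map<String, Double> averageSalaryByDepartment(List<Person> persons) {
        return persons.stream()
                .collect(Collectors.groupingBy(
                        person -> person.department().name(),
                        Collectors.averagingDouble(Person::salary)));
    }

    public static void main(String[] args) {
        System.out.println(averageSalaryByDepartment(sample()));
    }
}
```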

5. Finding the Maximum Value in Each Group

You can use maxBy to find the person with the highest salary in each department:

Map<String, Optional<Person>> map7 = persons.stream()
    .collect(Collectors.groupingBy(person -> person.department().name(),
            Collectors.maxBy(Comparator.comparingDouble(Person::salary))));
System.out.println("Max salaried person in each department");
System.out.println(map7);

For a more readable output (extracting only the name), you can use collectingAndThen:

Map<String, String> maxSalariedPersonInEachDepartment = persons.stream()
    .collect(Collectors.groupingBy(
        person -> person.department().name(),
        Collectors.collectingAndThen(
            Collectors.maxBy(Comparator.comparingDouble(Person::salary)),
            optionalPerson -> optionalPerson.map(Person::name).orElse(null)
        )
    ));
System.out.println("Max salaried person in each department");
System.out.println(maxSalariedPersonInEachDepartment);
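
If the whole Person is wanted without the Optional wrapper, one alternative is to leave groupingBy aside and use Collectors.toMap with a merge function built from BinaryOperator.maxBy. This is a different collector from the one this post is about, shown only as a sketch for comparison; TopEarnerSketch and sample() are made-up names.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

class TopEarnerSketch {
    // Same shapes as the article's records, redeclared so the sketch compiles on its own.
    record Department(int id, String name) {}
    record Person(int id, String name, double salary, Department department) {}

    // Hypothetical helper: the article's sample data.
    static List<Person> sample() {
        Department hr = new Department(1, "HR");
        Department finance = new Department(2, "Finance");
        Department admin = new Department(3, "ADMIN");
        return List.of(
                new Person(1, "Alex", 100d, hr),
                new Person(2, "Brian", 200d, hr),
                new Person(3, "Charles", 900d, finance),
                new Person(4, "David", 200d, finance),
                new Person(5, "Edward", 200d, finance),
                new Person(6, "Frank", 800d, admin),
                new Person(7, "George", 900d, admin));
    }

    // For each department name, keeps the person with the higher salary whenever two persons collide on a key.
    static Map<String, Person> topEarnerByDepartment(List<Person> persons) {
        return persons.stream()
                .collect(Collectors.toMap(
                        person -> person.department().name(),
                        Function.identity(),
                        BinaryOperator.maxBy(Comparator.comparingDouble(Person::salary))));
    }

    public static void main(String[] args) {
        topEarnerByDepartment(sample())
                .forEach((department, person) -> System.out.println(department + " -> " + person.name()));
    }
}
```

Every key in the resulting map comes from at least one person, so no value can be missing and no Optional is needed.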

6. Filtering Before Grouping

You might want to discard certain elements before grouping. For instance, the following keeps only persons with a salary greater than 300 and then groups their names by department. Departments in which nobody passes the filter do not appear in the result at all:

Map<String, List<String>> map8 = persons.stream()
    .filter(person -> person.salary() > 300d)
    .collect(Collectors.groupingBy(
        person -> person.department().name(),
        Collectors.mapping(Person::name, Collectors.toList())
    ));
System.out.println("Filtering persons with salary greater than 300");
System.out.println(map8);

7. Counting After Filtering

If every group key should survive even when none of its elements pass the filter, move the filter into the downstream collector with Collectors.filtering (available since Java 9) and combine it with counting:

Map<String, Long> map9 = persons.stream()
    .collect(Collectors.groupingBy(
        person -> person.department().name(),
        Collectors.filtering(person -> person.salary() > 300d, Collectors.counting())
    ));
System.out.println("Counting persons with salary greater than 300 after filtering");
System.out.println(map9);

This approach ensures that all departments are listed, even if no persons meet the filtering criteria.
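
To make the difference concrete, the sketch below runs both variants on the sample data from the full listing: Stream.filter before grouping, and Collectors.filtering as the downstream collector. FilterPlacementSketch, sample() and the two method names are chosen for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class FilterPlacementSketch {
    // Same shapes as the article's records, redeclared so the sketch compiles on its own.
    record Department(int id, String name) {}
    record Person(int id, String name, double salary, Department department) {}

    // Hypothetical helper: the article's sample data.
    static List<Person> sample() {
        Department hr = new Department(1, "HR");
        Department finance = new Department(2, "Finance");
        Department admin = new Department(3, "ADMIN");
        return List.of(
                new Person(1, "Alex", 100d, hr),
                new Person(2, "Brian", 200d, hr),
                new Person(3, "Charles", 900d, finance),
                new Person(4, "David", 200d, finance),
                new Person(5, "Edward", 200d, finance),
                new Person(6, "Frank", 800d, admin),
                new Person(7, "George", 900d, admin));
    }

    // Filter first: persons are dropped before a group key is ever computed for them.
    static Map<String, Long> countAfterStreamFilter(List<Person> persons) {
        return persons.stream()
                .filter(person -> person.salary() > 300d)
                .collect(Collectors.groupingBy(
                        person -> person.department().name(),
                        Collectors.counting()));
    }

    // Filter downstream: every person creates a key, but only matching persons are counted.
    static Map<String, Long> countWithDownstreamFilter(List<Person> persons) {
        return persons.stream()
                .collect(Collectors.groupingBy(
                        person -> person.department().name(),
                        Collectors.filtering(person -> person.salary() > 300d, Collectors.counting())));
    }

    public static void main(String[] args) {
        System.out.println(countAfterStreamFilter(sample()).containsKey("HR"));  // prints false
        System.out.println(countWithDownstreamFilter(sample()).get("HR"));       // prints 0
    }
}
```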

Full Code:

package ch.souradip.streamapi;
 
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;
 
record Person(int id, String name, double salary, Department department) {}
record Department(int id, String name) {}
 
public class GroupingByExamples {
    public static void main(String[] args) {
        /*
        * Overloads of Collectors.groupingBy:
        * groupingBy(classifier)
        * groupingBy(classifier, collector)
        * groupingBy(classifier, supplier, collector)
        * */
 
        /*
        * classifier: maps input elements to map keys.
        * collector: the downstream reduction applied to the elements of each group.
        *            By default, Collectors.toList() is used, which collects the grouped elements into a List.
        * supplier: provides a new empty Map into which the results will be inserted. By default, a HashMap is used
        *           (an implementation detail; the API makes no guarantee about the map type).
        *           We can supply other maps such as TreeMap, LinkedHashMap or ConcurrentHashMap to add behavior
        *           to the grouping process, such as sorted keys.
        * */
 
        /*
        * We can use Collectors.groupingByConcurrent() if we wish to process the stream elements in parallel,
        * which takes advantage of the machine's multiple cores and returns a ConcurrentMap.
        * It accepts the same three argument combinations and, except for concurrency, works like groupingBy().
        * */
 
 
        List<Person> persons = List.of(
                new Person(1, "Alex", 100d, new Department(1, "HR")),
                new Person(2, "Brian", 200d, new Department(1, "HR")),
                new Person(3, "Charles", 900d, new Department(2, "Finance")),
                new Person(4, "David", 200d, new Department(2, "Finance")),
                new Person(5, "Edward", 200d, new Department(2, "Finance")),
                new Person(6, "Frank", 800d, new Department(3, "ADMIN")),
                new Person(7, "George", 900d, new Department(3, "ADMIN")));
 
        // 1. Grouping By a Simple Condition - Grouping all persons by department
        Map<Department, List<Person>> map1 = persons.stream()
                .collect(Collectors.groupingBy(Person::department));
        System.out.println("Grouping all persons by department");
        System.out.println(map1);
 
        // 1.1. Grouping By a Simple Condition - collect only the person ids in all departments
        Map<Department, List<Integer>> map2 = persons.stream()
                .collect(Collectors.groupingBy(Person::department, Collectors.mapping(Person::id, Collectors.toList())));
        System.out.println("Grouping all persons ids only by department");
        System.out.println(map2);
 
        // 2. Grouping with Counting - We can also aggregate the values with other downstream collectors such as
        // counting(), averagingDouble(), summingDouble() etc.
        // Each of these reduces the elements of a group to a single Map value.
        Map<Department, Long> map3 = persons.stream()
                .collect(Collectors.groupingBy(Person::department, Collectors.counting()));
        System.out.println("Counting all the persons in a department.");
        System.out.println(map3);
 
        // 3. Grouping by a different attribute - names of the persons having the same salary
        Map<Double, List<String>> map4 = persons.stream()
                .collect(Collectors.groupingBy((person) -> person.salary(), Collectors.mapping(Person::name, Collectors.toList())));
        System.out.println("persons having the same salary.");
        System.out.println(map4);
 
        // Department name wise summing salary
        Map<String, Double> map5 = persons.stream()
                .collect(Collectors.groupingBy((person) -> person.department().name(), Collectors.summingDouble(Person::salary)));
        System.out.println("Department name wise summing salary");
        System.out.println(map5);
 
        // Department name wise averaging salary
        Map<String, Double> map6 = persons.stream()
                .collect(Collectors.groupingBy((person) -> person.department().name(), Collectors.averagingDouble(Person::salary)));
        System.out.println("Department name wise averaging salary");
        System.out.println(map6);
 
        // Max salaried person in each department
        Map<String, Optional<Person>> map7 = persons.stream()
                .collect(Collectors.groupingBy(person -> person.department().name() ,
                        Collectors.maxBy(Comparator.comparingDouble(Person::salary)))
                        );
        System.out.println("Max salaried person in each department");
        System.out.println(map7);
 
//        Map<String, String> maxSalariedPersonInEachDepartment = persons.stream()
//                .collect(Collectors.groupingBy(
//                        person -> person.department().name(),
//                        Collectors.collectingAndThen(
//                                Collectors.maxBy(Comparator.comparingDouble(person -> person.salary())),
//                                optionalPerson -> optionalPerson.map(person -> person.name()).orElse(null)
//                        )
//                ));
//
//        System.out.println("Max salaried person in each department");
//        System.out.println(maxSalariedPersonInEachDepartment);
 
        // Keeping only persons with salary greater than 300, then grouping their names by department
        Map<String, List<String>> map8 = persons.stream()
                .filter(person -> person.salary() > 300d)
                .collect(Collectors.groupingBy(
                        person -> person.department().name(),
                        Collectors.mapping(person -> person.name(), Collectors.toList())
                ));
        System.out.println("Filtering all persons with salary greater than 300");
        System.out.println(map8);
 
        // Filtering & counting all persons with salary greater than 300
        Map<String, Long> map9 = persons.stream()
                .filter(person -> person.salary() > 300d)
                .collect(Collectors.groupingBy(
                        person -> person.department().name(),
                        Collectors.counting()
                ));
        System.out.println("Filtering and counting all persons with salary greater than 300");
        System.out.println(map9); //{Finance=1, ADMIN=2}
 
//        The output above omits the HR department altogether because no person in that department matches
//        the condition. If we want to keep every key in the Map, even when no value matches, we can use
//        Collectors.filtering(), which applies the filter while adding values to the Map.
        Map<String, Long> map10 = persons.stream()
                .collect(Collectors.groupingBy(
                        person -> person.department().name(),
                        Collectors.filtering(person->person.salary() > 300d,Collectors.counting())
                ));
        System.out.println("Filtering and counting all persons with salary greater than 300");
        System.out.println(map10);//{Finance=1, HR=0, ADMIN=2}
 
 
    }
}

Output

Grouping all persons by department
{Department[id=2, name=Finance]=[Person[id=3, name=Charles, salary=900.0, department=Department[id=2, name=Finance]], Person[id=4, name=David, salary=200.0, department=Department[id=2, name=Finance]], Person[id=5, name=Edward, salary=200.0, department=Department[id=2, name=Finance]]], Department[id=3, name=ADMIN]=[Person[id=6, name=Frank, salary=800.0, department=Department[id=3, name=ADMIN]], Person[id=7, name=George, salary=900.0, department=Department[id=3, name=ADMIN]]], Department[id=1, name=HR]=[Person[id=1, name=Alex, salary=100.0, department=Department[id=1, name=HR]], Person[id=2, name=Brian, salary=200.0, department=Department[id=1, name=HR]]]}
Grouping all persons ids only by department
{Department[id=2, name=Finance]=[3, 4, 5], Department[id=3, name=ADMIN]=[6, 7], Department[id=1, name=HR]=[1, 2]}
Counting all the persons in a department.
{Department[id=2, name=Finance]=3, Department[id=3, name=ADMIN]=2, Department[id=1, name=HR]=2}
persons having the same salary.
{800.0=[Frank], 200.0=[Brian, David, Edward], 100.0=[Alex], 900.0=[Charles, George]}
Department name wise summing salary
{Finance=1300.0, HR=300.0, ADMIN=1700.0}
Department name wise averaging salary
{Finance=433.3333333333333, HR=150.0, ADMIN=850.0}
Max salaried person in each department
{Finance=Optional[Person[id=3, name=Charles, salary=900.0, department=Department[id=2, name=Finance]]], HR=Optional[Person[id=2, name=Brian, salary=200.0, department=Department[id=1, name=HR]]], ADMIN=Optional[Person[id=7, name=George, salary=900.0, department=Department[id=3, name=ADMIN]]]}
Filtering all persons with salary greater than 300
{Finance=[Charles], ADMIN=[Frank, George]}
Filtering and counting all persons with salary greater than 300
{Finance=1, ADMIN=2}
Filtering and counting all persons with salary greater than 300
{Finance=1, HR=0, ADMIN=2}
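
The comments in the full listing mention Collectors.groupingByConcurrent but the listing never calls it. As a rough sketch, assuming the counting example is run on a parallel stream, it could look like the following. ConcurrentCountSketch and sample() are illustrative names; with only seven elements there is no performance benefit, so this only shows the shape of the call.

```java
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;

class ConcurrentCountSketch {
    // Same shapes as the article's records, redeclared so the sketch compiles on its own.
    record Department(int id, String name) {}
    record Person(int id, String name, double salary, Department department) {}

    // Hypothetical helper: the article's sample data.
    static List<Person> sample() {
        Department hr = new Department(1, "HR");
        Department finance = new Department(2, "Finance");
        Department admin = new Department(3, "ADMIN");
        return List.of(
                new Person(1, "Alex", 100d, hr),
                new Person(2, "Brian", 200d, hr),
                new Person(3, "Charles", 900d, finance),
                new Person(4, "David", 200d, finance),
                new Person(5, "Edward", 200d, finance),
                new Person(6, "Frank", 800d, admin),
                new Person(7, "George", 900d, admin));
    }

    // Counts persons per department name on a parallel stream; the result type is a ConcurrentMap.
    static ConcurrentMap<String, Long> countByDepartmentConcurrently(List<Person> persons) {
        return persons.parallelStream()
                .collect(Collectors.groupingByConcurrent(
                        person -> person.department().name(),
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(countByDepartmentConcurrently(sample()));
    }
}
```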
