URL: https://bugs.openjdk.org/browse/JDK-8327778

ADDITIONAL SYSTEM INFORMATION :
OS: Windows 10
Java: Corretto JDK 1.8.262

A DESCRIPTION OF THE PROBLEM :
The streams passed to `Stream.flatMap(Function)` are consumed eagerly the first time the iterator returned by `Stream.iterator()` is queried with `iterator.hasNext()`. This is particularly problematic when the "source" streams traverse millions of objects: the iterator fills the heap with the internal SpinedBuffer used by the flattened stream, which leads to OutOfMemoryError.

It appears this was *almost* addressed in a backport I found here: https://bugs.openjdk.org/browse/JDK-8225328. If I'm reading it right, this was patched in 1.8.222; I can see the changes when I read the source code on my machine. This was supposed to address terminal operations "cancelling" the flattened stream early, but I don't think it covered the case where an `.iterator()` call was made on the stream.
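
A minimal sketch of how the two code paths could be compared, assuming a counter attached to the inner stream via `peek`. The class name `FlatMapLaziness` and its helper methods are hypothetical and are not part of the attached test case. It reports how many inner elements were pulled by a short-circuiting terminal operation (`findFirst()`) and by the first `hasNext()` call on `Stream.iterator()`. The exact numbers depend on the JDK build, so none are claimed here.

```java
import java.util.Iterator;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical helper class, for illustration only.
class FlatMapLaziness {

    // Builds an inner stream of n elements that counts every element pulled from it.
    static Stream<Integer> counting(int n, AtomicInteger pulled) {
        return IntStream.range(0, n).boxed().peek(i -> pulled.incrementAndGet());
    }

    // Number of inner elements pulled by a short-circuiting terminal operation.
    static int pulledByFindFirst(int n) {
        AtomicInteger pulled = new AtomicInteger();
        Stream.of(0).flatMap(x -> counting(n, pulled)).findFirst();
        return pulled.get();
    }

    // Number of inner elements pulled by the first hasNext() call on the iterator.
    static int pulledByIteratorHasNext(int n) {
        AtomicInteger pulled = new AtomicInteger();
        Iterator<Integer> it = Stream.of(0).flatMap(x -> counting(n, pulled)).iterator();
        it.hasNext();
        return pulled.get();
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println("findFirst() pulled:          " + pulledByFindFirst(n));
        System.out.println("iterator().hasNext() pulled: " + pulledByIteratorHasNext(n));
    }
}
```

If the behavior described above holds on the JDK under test, the second count would be expected to reach `n` while the first stays small. Comparing the two numbers on a given build is a quick way to check whether that build is affected.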

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run unit tests provided in source code attribute.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Tests ought to pass.
ACTUAL -
The streamTerminated test ought to pass, because the original largeElements stream shouldn't be used until the flatItr iterator is advanced. It fails because hasNext() reads all elements.

Similarly, iteratorExhausted ought to pass because the new iterator has not yet been used; it should not have consumed any elements from largeElements. It fails because flatItr.hasNext() consumed all source elements into a SpinedBuffer.

---------- BEGIN SOURCE ----------
import org.junit.Assert;
import org.junit.Test;

import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class TestCase {
    @Test
    public void streamTerminated() {
        Stream<?> largeElements = Stream.of(1, 2, 3);
        Collection<Stream<?>> itrList = Arrays.asList(largeElements);

        Iterator<?> flatItr = itrList.stream()
            .flatMap(s -> s)
            .iterator();

        Assert.assertTrue("Flattened doesn't hasNext()", flatItr.hasNext()); // Passes
        largeElements.collect(Collectors.toList()); // Exception
    }

    @Test
    public void iteratorExhausted() {
        Iterator<?> largeElements = Arrays.asList(1, 2, 3).iterator();
        Collection<Iterator<?>> itrList = Arrays.asList(largeElements);

        Iterator<?> flatItr = itrList.stream()
            .flatMap(itr -> StreamSupport
                .stream(
                    Spliterators.spliteratorUnknownSize(
                        itr,
                        Spliterator.ORDERED),
                    false)
                .sequential())
            .iterator();

        Assert.assertTrue(
            "Original iterator exhausted early",
            largeElements.hasNext()); // Passes
        Assert.assertTrue("Flattened doesn't hasNext()", flatItr.hasNext()); // Passes
        Assert.assertTrue(
            "Original iterator exhausted after hasNext()",
            largeElements.hasNext()); // Fails
    }
}
---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
The only workaround I have is to just *not* use Stream.iterator(). If I want an iterator of iterators (e.g. a "concatenated" iterator), I can create a special Iterator implementation that internally manages an iterator of iterators.
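
A minimal sketch of what such a concatenating iterator could look like. The class name `ConcatIterator` is hypothetical and this is not necessarily the implementation the reporter used. It only moves on to the next source iterator once the current one reports no more elements, and it never buffers elements itself.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical lazy "iterator of iterators"; the name and shape are assumptions.
final class ConcatIterator<T> implements Iterator<T> {
    private final Iterator<? extends Iterator<? extends T>> sources;
    private Iterator<? extends T> current;

    ConcatIterator(Iterator<? extends Iterator<? extends T>> sources) {
        this.sources = sources;
    }

    @Override
    public boolean hasNext() {
        // Skip exhausted sources; only ask the current source whether it has
        // another element, without calling next() on it.
        while (current == null || !current.hasNext()) {
            if (!sources.hasNext()) {
                return false;
            }
            current = sources.next();
        }
        return true;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }

    public static void main(String[] args) {
        Iterator<Integer> largeElements = Arrays.asList(1, 2, 3).iterator();
        Iterator<Integer> flatItr =
            new ConcatIterator<Integer>(Collections.singletonList(largeElements).iterator());

        // Mirrors the iteratorExhausted test: hasNext() on the flattened
        // iterator does not call next() on the source.
        System.out.println(flatItr.hasNext());
        System.out.println(largeElements.hasNext());
    }
}
```

The trade-off is that this gives up the rest of the Stream API on the flattened sequence. Any mapping or filtering would have to be applied to the individual source iterators or to the elements as they are consumed.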

FREQUENCY : always


Assignee: Viktor Klang
Reporter: Webbug Group