
How can I avoid AbstractQueuedSynchronizer$Node and FiberTimedScheduler$ScheduledFutureTask? #224

Open
Jire opened this issue Sep 21, 2016 · 10 comments


Jire commented Sep 21, 2016

How can I avoid creating these objects? Why are these objects created? They seem to be retained and won't be garbage collected, climbing to 50MB+ of memory usage.

Here's a screenshot of YourKit after a few minutes of my application running: http://i.imgur.com/ppMFmyU.png


pron commented Sep 22, 2016

j.u.c.AbstractQueuedSynchronizer has nothing to do with Quasar, but the fact that it shows up suggests you may accidentally be using some thread synchronization that blocks fibers, which may also explain why FiberTimedScheduler$ScheduledFutureTasks keep piling up.
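
For illustration, the kind of accidental thread blocking that produces those nodes is taking a plain java.util.concurrent lock inside a fiber: a contended j.u.c. lock parks the carrier thread and enqueues AbstractQueuedSynchronizer$Nodes, whereas Quasar's strand-aware locks only suspend the fiber. A minimal sketch (assuming the strand-aware ReentrantLock in co.paralleluniverse.strands.concurrent; whether your code actually does anything like this is exactly what needs checking):

    import co.paralleluniverse.kotlin.fiber
    import co.paralleluniverse.strands.concurrent.ReentrantLock as StrandLock
    import java.util.concurrent.locks.ReentrantLock as ThreadLock

    val threadLock = ThreadLock() // j.u.c. lock: contended waits park the thread and enqueue AQS$Nodes
    val strandLock = StrandLock() // strand-aware lock: contended waits suspend only the fiber

    fun main(args: Array<String>) {
        fiber {
            threadLock.lock() // accidental thread blocking inside a fiber
            try { /* work */ } finally { threadLock.unlock() }

            strandLock.lock() // fiber-friendly alternative
            try { /* work */ } finally { strandLock.unlock() }
        }.join()
    }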


Jire commented Sep 29, 2016

@pron Our app does not use any true blocking such as synchronization, but it does constantly do a lot of work across many fibers at once.

Some points of concern:

  • We set Quasar's fiber pool to a single thread (parallelism 1; see the sketch after this list)
  • We also use thread locals
  • We make liberal use of Kotlin's lazy delegates (with NONE synchronization)
  • Some of our fibers do work around the clock, with pauses of under 2 ms
  • We make heavy use of Strand.sleep
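
Roughly, a typical setup looks like the following minimal sketch (loadConfig and doWork are hypothetical placeholders, not our real code, and the parallelism system property is assumed to be how the fiber pool is pinned to one thread):

    import java.util.concurrent.TimeUnit
    import co.paralleluniverse.kotlin.fiber
    import co.paralleluniverse.strands.Strand

    // Kotlin lazy delegate with NONE synchronization
    val config by lazy(LazyThreadSafetyMode.NONE) { loadConfig() }

    fun main(args: Array<String>) {
        // pin Quasar's default fiber pool to a single thread
        System.setProperty("co.paralleluniverse.fibers.DefaultFiberPool.parallelism", "1")

        fiber {
            while (!Strand.interrupted()) {
                doWork(config)                          // work around the clock...
                Strand.sleep(1, TimeUnit.MILLISECONDS)  // ...with sub-2 ms pauses
            }
        }

        Strand.sleep(Long.MAX_VALUE) // prevent exit
    }

    fun loadConfig() = "config"     // hypothetical placeholder
    fun doWork(config: String) { }  // hypothetical placeholder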

What do you think?

Jire closed this as completed Sep 29, 2016
Jire reopened this Sep 29, 2016

pron commented Oct 1, 2016

I think you'll need to investigate where those j.u.c.AbstractQueuedSynchronizers are created with a debugger.


pron commented Nov 7, 2016

Hi @Jire . Any news on this?

pron self-assigned this Nov 7, 2016

jonatino commented Dec 30, 2016

@pron We've both tried to take a look at this and still have not had any luck. We call Strand.sleep a lot, which seems to be responsible for the large number of ScheduledFutureTask instances (https://dl.dropboxusercontent.com/u/91292881/ShareX/2016/12/javaw_2016-12-30_03-57-34.png):

    public Future<Void> schedule(Fiber<?> fiber, Object blocker, long delay, TimeUnit unit) {
        if (fiber == null || unit == null)
            throw new NullPointerException();
        assert fiber.getScheduler() == scheduler;
        ScheduledFutureTask t = new ScheduledFutureTask(fiber, blocker, triggerTime(delay, unit));
        delayedExecute(t);
        return t;
    }

Regarding the AbstractQueuedSynchronizer, could it also be caused by the number of Strand.sleep calls we make?

Here is a small example which will reproduce the same results after 5-10 minutes of running.

import co.paralleluniverse.kotlin.fiber
import co.paralleluniverse.strands.Strand
import java.util.concurrent.ThreadLocalRandom
import java.util.concurrent.TimeUnit

fun main(args: Array<String>) {
	System.setProperty("co.paralleluniverse.fibers.detectRunawayFibers", "false")
	System.setProperty("co.paralleluniverse.fibers.verifyInstrumentation", "false")
	System.setProperty("co.paralleluniverse.fibers.DefaultFiberPool.parallelism", "1")
	
	every(8) {
		val r = nextInt(0,40)
		if (r == 10) {
			Strand.sleep((20 + nextInt(0, 200)).toLong())
		}
	}
	
	Strand.sleep(Long.MAX_VALUE) // prevent exit
}

// runs body on its own fiber every `duration` milliseconds until the fiber is interrupted
inline fun every(duration: Int, crossinline body: () -> Unit) = fiber {
	while (!Strand.interrupted()) {
		body()
		Strand.sleep(duration.toLong(), TimeUnit.MILLISECONDS)
	}
}

fun nextInt(min:Int, max:Int) = ThreadLocalRandom.current().nextInt(min,max)

Screenshots: (YourKit allocation views)

This is just one example of a typical Strand we have in our project. When you multiply those results by 10 or 20 you can see the issue we are having 😄


pron commented Dec 30, 2016

Can you try running with -Dco.paralleluniverse.fibers.useLockFreeDelayQueue or -Dco.paralleluniverse.fibers.useLockFreeDelayQueue=true?

The ScheduledFutureTask will still be allocated, but I believe it will stop allocating j.u.c.AbstractQueuedSynchronizer$Nodes.
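
(If it helps, the same flag can also be set programmatically, before any fibers are started, next to the other properties in a main() like the reproducer's:)

    // must run before the fiber scheduler is created
    System.setProperty("co.paralleluniverse.fibers.useLockFreeDelayQueue", "true")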

jonatino commented Dec 30, 2016

That did get rid of the j.u.c.AbstractQueuedSynchronizer$Node, but it started allocating co.paralleluniverse.concurrent.util.ConcurrentSkipListPriorityQueue$Node and co.paralleluniverse.concurrent.util.ConcurrentSkipListPriorityQueue$Index instances instead.


In case you need YourKit, they would be more than happy to offer you an open-source license. Their only requirement is a mention somewhere in the project (for example, ours is at the end of our README: https://github.com/Jire/Acelta). https://www.yourkit.com/purchase/#os_license


pron commented Dec 30, 2016

Well, at least I figured out where those nodes were coming from.

Now, whenever a thread (or a fiber -- same thing) blocks, something must be allocated, whether it's a node in a waiters list on a lock or, as in this case, a node in a scheduled-waiting list. There's no getting around that. I suppose we could use a different data structure, like an array list, to hold the records -- which would be very unorthodox -- but as the list must be kept sorted, that would mean constantly searching it. Anyway, in all languages and implementations I know of, blocking any kind of thread entails an allocation, and those records must be maintained until the thread is unblocked. Do you have any indication that they are preserved beyond that point?
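
To make that concrete, a purely hypothetical sketch of the kind of record a timed-wait queue must keep per sleeping strand (these are not Quasar's actual types; compare the real ScheduledFutureTask constructor quoted above):

    // Hypothetical illustration only, not Quasar's real classes.
    // One record like this is allocated per blocked strand and kept until it wakes.
    data class TimedWaiter(
        val strand: Any,            // the blocked fiber or thread
        val blocker: Any?,          // what it is waiting on
        val triggerTimeNanos: Long  // when to wake it; the queue is ordered by this
    )
    // Whatever container holds these (delay queue, skip list, sorted array)
    // allocates at least one node per blocked strand; that is the allocation in question.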


pron commented Dec 30, 2016

Actually, after some more thought, I think we can significantly reduce that allocation, but it will take a bit of work, and I want to make sure that you are actually experiencing adverse GC effects because of this. Anyway, when I said "there's no getting around that", I was wrong. We can get around that.

jonatino commented Dec 30, 2016

@pron Has this issue been brought up before? Our software cycles within 0-1 ms, so it's crucial that we create zero garbage in order to avoid any additional latency from GC cleanup (the only garbage we have is the two kinds of Quasar allocations mentioned above). Whether it's worth putting in the work to fix this is totally up to you; IMO, if you can think of a way to significantly reduce the allocations without a performance impact, I don't see why not. 😄
