Slow startup for slaves with much memory

Arvid Heise
Hi Flinker,

no question - just brief feedback.

If you have lots of memory available (TM_HEAP=.7TB), the startup of the slaves can be very slow.

1) With the -Xms parameter set in taskmanager.sh, the JVM takes a long time to initialize. No log file is available during that time and the .out file is empty. It took me quite a while to figure out what was wrong (in fact nothing, but it looked as if the taskmanagers had crashed).

2) After removing that parameter, a second bottleneck occurs: the memory manager initializes its memory upfront. Until that has finished, the task manager is not registered and tasks cannot be scheduled (it takes up to 30 minutes for .5TB to be initialized).

I'm now running the experiments with 10% of the available RAM. Not an ideal solution, but my environment is somewhat special.

Best,

Arvid
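
To illustrate point 2, here is a minimal sketch of the difference between allocating all memory segments upfront and creating them on demand. It is not Flink's actual memory manager: the class `SegmentPool`, its methods, and the segment sizes are hypothetical names chosen for this example only.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical pool of fixed-size segments; NOT Flink's real memory manager. */
final class SegmentPool {
    private final int segmentSize;
    private final int maxSegments;
    private final Deque<byte[]> free = new ArrayDeque<>();
    private int allocated; // number of segments created so far

    /** With eager = true every segment is allocated in the constructor (upfront). */
    SegmentPool(int segmentSize, int maxSegments, boolean eager) {
        this.segmentSize = segmentSize;
        this.maxSegments = maxSegments;
        if (eager) {
            while (allocated < maxSegments) {
                free.push(new byte[segmentSize]);
                allocated++;
            }
        }
    }

    /** Hands out a segment, creating it on demand while the budget allows. */
    byte[] request() {
        if (!free.isEmpty()) {
            return free.pop();
        }
        if (allocated >= maxSegments) {
            throw new IllegalStateException("segment budget exhausted");
        }
        allocated++;
        return new byte[segmentSize];
    }

    /** Returns a segment to the pool so that it can be reused. */
    void release(byte[] segment) {
        free.push(segment);
    }

    int allocatedSegments() {
        return allocated;
    }

    public static void main(String[] args) {
        SegmentPool eager = new SegmentPool(32 * 1024, 1024, true);
        SegmentPool lazy = new SegmentPool(32 * 1024, 1024, false);
        System.out.println("eager, allocated at startup: " + eager.allocatedSegments());
        System.out.println("lazy, allocated at startup: " + lazy.allocatedSegments());
    }
}
```

With the eager variant the constructor cost grows with the total pool size, which is the kind of delay described above; the lazy variant defers that cost to the first requests.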

Re: Slow startup for slaves with much memory

Stephan Ewen
We are introducing lazy memory initialization in the next versions. Nevertheless, it seems rather extreme that the JVM takes 30 minutes just to allocate 700 GB of byte arrays.

Can you make sure that Xms and Xmx are the same? Otherwise, the heap grows incrementally through tenured-generation garbage collections, which takes longer.
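
One way to check this from inside a running JVM is to compare the initial and maximum heap sizes reported by the standard `java.lang.management` API. The sketch below is only a rough check: the class name `HeapSettingsCheck` and the 10% tolerance are assumptions made for this example, and the reported values can differ somewhat from the -Xms/-Xmx flags depending on the collector and on alignment.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical helper for this thread; not part of Flink. */
final class HeapSettingsCheck {

    /**
     * True when the initial heap is within 10% of the maximum heap.
     * The tolerance is an assumption: the JVM may report values that
     * differ slightly from the -Xms/-Xmx flags.
     */
    static boolean isRoughlyFixedSize(long initBytes, long maxBytes) {
        // MemoryUsage reports -1 when a value is undefined.
        if (initBytes <= 0 || maxBytes <= 0) {
            return false;
        }
        return initBytes >= maxBytes - maxBytes / 10;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.printf("init=%d MB, committed=%d MB, max=%d MB%n",
                heap.getInit() >> 20, heap.getCommitted() >> 20, heap.getMax() >> 20);
        System.out.println("heap roughly fixed-size: "
                + isRoughlyFixedSize(heap.getInit(), heap.getMax()));
    }
}
```

Running it once with and once without matching -Xms/-Xmx values should show whether the heap starts at its full size or has to grow.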


(In reply to Arvid Heise's message of Wed, Aug 13, 2014 at 2:36 PM)

Re: Slow startup for slaves with much memory

Arvid Heise
You are right about Xms: with it set, memory initialization is much faster, but the JVM itself starts more slowly (up to 5 min).

Still, to speed up my cold-cache experiments, I'm now using far less memory for the memory manager.


(In reply to Stephan Ewen's message of Wed, Aug 13, 2014 at 4:30 PM)

Re: Slow startup for slaves with much memory

Stephan Ewen
If the JVM startup itself is that slow, the issue may be that the JVM is not optimized for several hundred megabytes of memory.

In the long run, we may think about having multiple smaller JVMs per node.

Re: Slow startup for slaves with much memory

Sebastian Schelter

You should also check whether swapping happens somewhere.

-s
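
A quick way to look for swap usage on a Linux machine is to read the `SwapTotal` and `SwapFree` fields from `/proc/meminfo`. The sketch below assumes a Linux host; the class name `SwapCheck` is hypothetical. A non-zero result only says that some swap is in use, not that pages are actively being swapped during startup.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

/** Hypothetical helper for this thread; assumes Linux's /proc/meminfo format. */
final class SwapCheck {

    /** Returns swap in use (kB) from meminfo-style lines, or -1 if the fields are missing. */
    static long swapUsedKb(List<String> lines) {
        long total = -1;
        long free = -1;
        for (String line : lines) {
            // Lines look like "SwapTotal:       2097148 kB".
            String[] parts = line.trim().split("\\s+");
            if (parts.length < 2) {
                continue;
            }
            if (parts[0].equals("SwapTotal:")) {
                total = Long.parseLong(parts[1]);
            } else if (parts[0].equals("SwapFree:")) {
                free = Long.parseLong(parts[1]);
            }
        }
        return (total < 0 || free < 0) ? -1 : total - free;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get("/proc/meminfo"));
        System.out.println("swap in use (kB): " + swapUsedKb(lines));
    }
}
```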

(In reply to Stephan Ewen's message of 13.08.2014 11:27)

Re: Slow startup for slaves with much memory

Stephan Ewen
Just noticed: The "several hundred MEGAbytes" in my previous mail were supposed to be "several hundred GIGAbytes" ;-)

Stephan



(In reply to Sebastian Schelter's message of Wed, Aug 13, 2014 at 9:27 PM)

Re: Slow startup for slaves with much memory

Arvid Heise
Swapping is not involved, as there is no hard disk ;)


(In reply to Stephan Ewen's message of 2014-08-13 21:29 GMT+02:00)