Verification Guild
A Community of Verification Professionals

Verification Guild: Forums


Linux Farm/Load sharing software

 
Sharan
Junior
Joined: Feb 29, 2004
Posts: 5

Posted: Fri Dec 03, 2004 3:50 am    Post subject: Linux Farm/Load sharing software

Dear forum members,

Currently we use a limited number of Sun servers and desktops to run our simulations.

To make efficient use of the available computing power, to scale easily, and to avoid the manual job-scheduling nightmares that engineers face, we are looking at migrating to a small Linux farm. We think Linux is the better choice because of the value proposition it provides.

Assuming that the infrastructure will be used for testing a ~10 million gate design, I would like to know your opinion on the following:

1) Is Linux the best way to go (vis-a-vis Sun/Solaris)? Are there any unexpected pitfalls?

2) Is load sharing software a must? To put it another way, would having a small grid without load sharing software defeat the very purpose?

3) Is there any load sharing software that you would recommend, commercial or otherwise?

4) Assuming a 2-CPU configuration and a simulator that is not multithreaded, is there a marked performance degradation when 2 simulations (with almost the same memory image) are run on such a box? A speedup of 2X cannot be expected, but a speedup of less than 1.5X is not good either!

5) With load sharing software, would there be a radical change in the way engineers use servers?
Engineers have their own preferences for shells etc. Will load sharing software restrict users to
specific shells?

6) Does load sharing software allow certain users (or certain processes) to be directed to a specific group of machines?
For example, one would want to run memory-hungry simulations on machines that have large memory capacity.

7) While going through the product literature of one popular load sharing package, I saw a mention of support for
different EDA tools (NCSIM, VCS, DC etc.).
My question is: why does load sharing software need to be tool-aware?

phip
Junior
Joined: Jan 15, 2004
Posts: 6

Posted: Fri Dec 03, 2004 8:05 am    Post subject: re: Linux Farm/Load sharing software

Hi Sharan,

First, I'll respond to your individual questions, and then give some more general comments.

> 1) Is Linux the best way to go (vis-a-vis Sun/Solaris)? Are there any
> unexpected pitfalls?

Linux is becoming the obvious choice for several reasons:
1. The performance and price/performance of Intel and AMD processors are improving more rapidly than those of RISC/UNIX systems.
2. Linux is free (and libre).
3. EDA vendors are supporting Linux.
> 2) Is load sharing software a must? To put it another way, would
> having a small grid without load sharing software defeat the
> very purpose?

You could manage a very small compute farm by checking the CPU load on all machines, and then manually submitting jobs to a machine with a low load. This approach will run out of steam very quickly as the compute farm size and utilization increase. I would definitely recommend some kind of load-sharing software.
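
As a rough illustration of the manual approach described above, here is a minimal Python sketch that picks the least-loaded machine from a set of load averages. The host names, the load figures, and the `max_load` cutoff are all made up for the example; in practice the numbers would have to be gathered somehow (for instance by hand with `uptime` on each box), and that collection step is exactly the part that stops scaling.

```python
def pick_least_loaded(loads, max_load=1.0):
    """Return the host with the lowest load below max_load, or None if all are busy.

    loads maps a host name to its current load average.
    max_load is a hypothetical cutoff above which a host is considered busy.
    """
    candidates = {host: load for host, load in loads.items() if load < max_load}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


# Hypothetical 1-minute load averages for a three-machine farm.
loads = {"farm01": 1.8, "farm02": 0.2, "farm03": 0.9}
chosen = pick_least_loaded(loads)
print(chosen)
```

Even this toy version shows the weakness: the load snapshot is stale as soon as a second engineer picks the same "idle" machine, which is one of the problems load-sharing software is meant to handle.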

> 6) Does load sharing software allow certain users (or certain
> processes) to be directed to a specific group of machines? For example, one
> would want to run memory-hungry simulations on machines that
> have large memory capacity.

Most load-sharing programs support this. One approach is to define several queues for different types of jobs. Load-sharing programs usually also let you specify the required machine capacity (memory, speed, disk, OS version, available tools/licenses, etc.) for each compute job as it is submitted.
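
To make the idea of queues plus per-job resource requirements concrete, here is a small Python sketch of the matching step. It is not how any particular load-sharing product works internally; the `Machine` and `Job` classes, the queue names, and the memory sizes are all hypothetical, chosen only to show the principle.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    mem_gb: int        # installed memory
    queues: frozenset  # names of the queues this machine serves


@dataclass
class Job:
    name: str
    queue: str   # queue the job was submitted to
    mem_gb: int  # memory requested at submit time


def eligible_machines(job, machines):
    """List the machines that serve the job's queue and have enough memory."""
    return [
        m.name
        for m in machines
        if job.queue in m.queues and m.mem_gb >= job.mem_gb
    ]


# Hypothetical farm: one small box and one large-memory box.
machines = [
    Machine("small01", 4, frozenset({"normal"})),
    Machine("big01", 16, frozenset({"normal", "bigmem"})),
]

big_job = Job("chip_top_sim", "bigmem", 12)
small_job = Job("block_sim", "normal", 2)
print(eligible_machines(big_job, machines))
print(eligible_machines(small_job, machines))
```

In this sketch the memory-hungry job can only land on the large-memory box, while the small job may run on either machine.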

> 7) While going through the product literature of one popular load sharing package,
> I saw a mention of support for different EDA tools
> (NCSIM, VCS, DC etc.). My question is: why does load sharing
> software need to be tool-aware?

The main advantage I've come across is the ability for the load-sharing program to tell a compute job (simulation, etc.) to save a checkpoint. The job can then be stopped and restarted later, or on a different machine, without losing the work that has already been done. This certainly is not essential, and is probably useless on a small grid/farm that is dedicated to simulation jobs. This feature can improve grid utilization in larger grids with multiple classes of jobs, users, and priorities.

One other issue to consider is who will manage the grid - the users/engineers, or a separate IT support group. There are advantages both ways. An IT support group will probably have more experience with system administration, but they will react more slowly to changing needs (and urgent problems) from the design/verification group. They will also tend to be less motivated to make an extra effort to maximize the utilization and throughput of the grid. I think the ideal approach is to have one or a few system administrators who report directly to the engineering project group, instead of reporting into a separate corporate IT structure. This seems like a good idea to me, but I have never seen it in practice.

Moving to a grid-based compute infrastructure might also be the ideal time to consider improving the regression management and result logging aspects of your verification environment. The general goal is to have a "throughput oriented" simulation approach, rather than a "job turn-around" oriented approach. This means you need to be able to quickly analyze a large number of failing simulations and decide which tests to debug individually after the fact.
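
One simple way to triage a large batch of failures is to group the failing tests by their first error message, so that one debug session can cover a whole bucket. The Python sketch below assumes the regression results have already been reduced to (test name, passed, first error) records; that record format and the example messages are hypothetical.

```python
from collections import defaultdict


def triage(results):
    """Group failing tests by first error message, largest group first.

    results is an iterable of (test_name, passed, first_error) tuples.
    """
    buckets = defaultdict(list)
    for test_name, passed, first_error in results:
        if not passed:
            buckets[first_error].append(test_name)
    return sorted(buckets.items(), key=lambda item: len(item[1]), reverse=True)


# Made-up results from an overnight regression.
results = [
    ("test_a", True, ""),
    ("test_b", False, "fifo overflow"),
    ("test_c", False, "timeout waiting for ack"),
    ("test_d", False, "fifo overflow"),
]
for message, tests in triage(results):
    print(len(tests), message, tests)
```

Debugging one test from the largest bucket first is then a reasonable default, since a single fix may clear many failures at once.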

I hope you find this helpful.

Cheers,
Phil

richardbradley
Senior
Joined: Feb 10, 2004
Posts: 73
Location: St Louis, Mo

Posted: Fri Dec 03, 2004 9:17 am

2. The best way to go is to have BIG Sun/HP machines, with something like 64 processors. But since this is SO expensive, many people are moving to small farms. With a small farm you do not NEED load sharing software. When doing this, it is important to have an agreement as to which computers each team/engineer can use, and to keep talking about it. Linux boxes are so cheap that it may be wise to buy MORE compute resources than you have HDL licenses, just to give people breathing room.

3. You can check out "parallel" from the download/Utilities section of the Verification Guild. It is not load sharing software like what you are looking for, but it does help with the regression part of the problem when using a farm.

4. Each processor responds differently to the number of processes run. The exact nature of the processes run also has a lot to do with it. Typically, on a 2-way Intel box you will not see much of a slowdown at all for two simulations. On the new hyperthreaded processors from Intel, I was actually running up to 4 processes on a 2-way box and still seeing a positive gain in throughput! Each ran slower, but taken all together they still finished faster than running the 4 simulations two at a time.

My suggestion is to set up a trial box configured like the boxes you wish to buy. Have someone convert your stuff over (some tweaking will be needed), and then measure the throughput under various loads.
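
To compare measurements from such a trial, the arithmetic can be sketched in a few lines of Python. The timings and slowdown factors below are invented for illustration and are not measurements from any real box; `slowdown` here means how much longer one simulation takes when it shares the machine, relative to running alone.

```python
import math


def wall_time(n_jobs, n_parallel, slowdown, t_single):
    """Wall-clock time to finish n_jobs when n_parallel of them run at once.

    t_single is the run time of one job alone on the box; slowdown is the
    factor by which each job stretches when n_parallel jobs share the box.
    """
    batches = math.ceil(n_jobs / n_parallel)
    return batches * slowdown * t_single


def speedup(n_parallel, slowdown):
    """Throughput gain over running the same jobs strictly one at a time."""
    return n_parallel / slowdown


# Invented numbers: one simulation alone takes 60 minutes.
T = 60.0
two_at_a_time = wall_time(4, 2, slowdown=1.25, t_single=T)
four_at_once = wall_time(4, 4, slowdown=2.0, t_single=T)
print(two_at_a_time, four_at_once)
print(speedup(2, 1.25))
```

With these invented figures, four simulations finish in 150 minutes when run two at a time and in 120 minutes when all four run together, even though each individual run is slower. The same formula gives the bound for the 1.5X target in question 4: two simultaneous simulations reach 1.5X only if each one slows down by a factor of no more than about 1.33.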

~Rich

EdA
Senior
Joined: Jan 06, 2004
Posts: 64

Posted: Fri Dec 03, 2004 9:18 am    Post subject: Load Sharing Software

For commercial load sharing software: LSF (http://www.platform.com)
For open source: Gridware (http://gridengine.sunsource.net)

LSF handles integration with license managers better.

Both accomplish the same thing.

You can restrict interactive logins to your farm boxes.
You can also set up policies for who can submit what, when, and with what resource requirements.

/Ed

Sharan
Junior
Joined: Feb 29, 2004
Posts: 5

Posted: Tue Dec 07, 2004 2:51 am    Post subject: linux farm -- thanks

Hi Phip, Ed, Richard,

Thanks for your valuable input.

Richard: I will give "parallel" a try as soon as I get some time.

Regards,
Sharan

All times are GMT - 5 Hours
Verification Guild © 2006 Janick Bergeron