Requesting nodes by numbers and their names in SGE - sungridengine

How do I request a number of nodes (not procs) when submitting a job in SGE?
For example, in TORQUE we can specify qsub -l nodes=3
How do I request nodes by their names in SGE?
For example, in TORQUE we can do this with qsub -l nodes=abc+xyz+pqr, where abc, xyz and pqr are hostnames
For a single hostname, qsub -l hostname=abc works. But how do I delimit multiple hostnames in SGE?

Grid Engine lets you request a number of nodes only indirectly.
To submit a parallel job you request a parallel environment (man sge_pe)
together with a number of slots (processors etc.), for example
qsub -pe mytestpe 12...
How the slots are distributed over one or more nodes depends on the
allocation_rule defined in the parallel environment (qconf -sp mytestpe).
With a so-called fixed allocation rule, where the rule is simply a number
such as 4 (4 slots per host), it is easy: for one host submit with
-pe mytestpe 4, for 10 nodes submit with -pe mytestpe 40.
A node can be requested by name with -l h=abc. Since node names are
RESTRINGs (regular expression strings) in Grid Engine, you can write
a regular expression for host filtering: qsub -l h="abc|xyz".
You can also create host groups (qconf -ahgrp) and request
so-called queue domains (qsub -q all.q@@mygroup).
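The slot arithmetic above can be sketched in a few lines of Python. This is only an illustration, assuming a fixed allocation rule; the helper names pe_slots and qsub_args and the script name job.sh are made up for the example and are not part of Grid Engine.

```python
import shlex


def pe_slots(nodes, slots_per_host):
    """Slots to request so that a fixed allocation rule fills `nodes` hosts."""
    if nodes < 1 or slots_per_host < 1:
        raise ValueError("nodes and slots_per_host must be positive")
    return nodes * slots_per_host


def qsub_args(pe_name, nodes, slots_per_host, hosts=(), script="job.sh"):
    """Build a qsub argument list; job.sh is a placeholder script name."""
    args = ["qsub", "-pe", pe_name, str(pe_slots(nodes, slots_per_host))]
    if hosts:
        # Host names are regular-expression strings, so "|" means "or".
        args += ["-l", "h=" + "|".join(hosts)]
    args.append(script)
    return args


if __name__ == "__main__":
    # Two hosts with 4 slots each, restricted to abc or xyz.
    print(shlex.join(qsub_args("mytestpe", 2, 4, hosts=["abc", "xyz"])))
```

Building an argument list rather than a single string keeps the "|" in the host expression away from the shell until it is quoted.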
Daniel
http://www.gridengine.eu

You can use -tc to limit the number of concurrent tasks of an array job (i.e., the number of slots the array job will occupy at any one time). I use this when I submit array jobs with 100 sub-jobs to limit the impact on our queue, defaulting to 10 simultaneous jobs with -tc 10. As each task finishes, another task from the pending pool is started.
The only way I've been able to figure out to do this is to set up specific resource quota sets (using qconf -mrqs) for the particular host groups you want to use. You would first have to set up all of the combinations that you want. I don't see a real reason to request specific hosts, though, unless these hosts have specific resources that you want to use (in which case, I'd set up consumable resources for those, assign the appropriate amount to each host that can supply them, and request those resources instead of naming hosts for a particular job).

ZooKeeper Multi-Server Setup by Example

The ZooKeeper multi-server config docs show the following settings, which can be placed in zoo.cfg (ZK's config file) on each server:
tickTime=2000
dataDir=/var/zookeeper/
clientPort=2181
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
Furthermore, they state that you need a myid file on each ZK node whose content matches one of the server.id values above. So for example, in a 3-node "ensemble" (ZK cluster), the first node's myid file would simply contain the value 1. The second node's myid file would contain 2, and so forth.
I have a few practical questions about what this looks like in the real world:
1. Can localhost be used? If zoo.cfg has to be repeated on each node in the ensemble, is it OK to define the current server as localhost? For example, in a 3-node ensemble, would it be OK for Server #2's zoo.cfg to look like:
tickTime=2000
dataDir=/var/zookeeper/
clientPort=2181
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=localhost:2888:3888 # After all, we're on server #2!
server.3=zoo3:2888:3888
Or is this not advised/not possible?
2. Do the server ids have to be numerical? For instance, could I have a 5-node ensemble where each server's zoo.cfg looks like:
tickTime=2000
dataDir=/var/zookeeper/
clientPort=2181
initLimit=5
syncLimit=2
server.red=zoo1:2888:3888
server.green=zoo2:2888:3888
server.blue=zoo3:2888:3888
server.orange=zoo4:2888:3888
server.purple=zoo5:2888:3888
And, say, Server 1's myid would contain the value red inside of it (etc.)?
1. Can localhost be used?
This is a good question, as the ZooKeeper docs don't make it crystal clear whether the configuration file only accepts IP addresses. They say only hostname, which could mean an IP address, a DNS name, or a name in the hosts file, such as localhost.
server.x=[hostname]:nnnnn[:nnnnn], etc
(No Java system property)
servers making up the ZooKeeper ensemble. When the server starts up, it determines which server it is by looking for the file myid in the data directory. That file contains the server number, in ASCII, and it should match x in server.x in the left hand side of this setting.
However, note that ZooKeeper recommends using exactly the same configuration file on all hosts:
ZooKeeper's behavior is governed by the ZooKeeper configuration file. This file is designed so that the exact same file can be used by all the servers that make up a ZooKeeper server assuming the disk layouts are the same. If servers use different configuration files, care must be taken to ensure that the list of servers in all of the different configuration files match.
So simply put each machine's IP address in its server.x entry and everything should work. Also, I have personally tested using 0.0.0.0 (in a situation where the interface IP address was different from the public IP address) and it does work.
2. Do the server ids have to be numerical?
According to the ZooKeeper multi-server configuration docs, myid needs to be a numerical value from 1 to 255:
The myid file consists of a single line containing only the text of that machine's id. So myid of server 1 would contain the text "1" and nothing else. The id must be unique within the ensemble and should have a value between 1 and 255.
Since myid must match the x in server.x parameter, we can infer that x must be a numerical value as well.
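The two points above (one identical zoo.cfg everywhere, and a numeric myid between 1 and 255) can be sketched in Python. This is a rough illustration rather than anything from the ZooKeeper distribution: render_zoo_cfg and parse_myid are hypothetical helper names, and the settings are copied from the example config in the question.

```python
def parse_myid(text):
    """Return the server id from the contents of a myid file, or raise ValueError."""
    value = text.strip()
    if not (value.isascii() and value.isdigit()):
        raise ValueError("myid must contain only a number, got %r" % value)
    server_id = int(value)
    if not 1 <= server_id <= 255:
        raise ValueError("myid must be between 1 and 255, got %d" % server_id)
    return server_id


def render_zoo_cfg(hosts, data_dir="/var/zookeeper/"):
    """Render one zoo.cfg that can be copied unchanged to every server."""
    if not 1 <= len(hosts) <= 255:
        raise ValueError("an ensemble needs between 1 and 255 servers")
    lines = [
        "tickTime=2000",
        "dataDir=" + data_dir,
        "clientPort=2181",
        "initLimit=5",
        "syncLimit=2",
    ]
    # Server ids are numeric and start at 1; each must match that host's myid.
    for server_id, host in enumerate(hosts, start=1):
        lines.append("server.%d=%s:2888:3888" % (server_id, host))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_zoo_cfg(["zoo1", "zoo2", "zoo3"]), end="")
    print(parse_myid("2\n"))
```

Only the myid file differs between nodes; a value such as red is rejected by parse_myid, in line with the docs quoted above.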

Google Compute Engine API - Filtering Instances or Other Lists Using Labels / Tags

With the gcloud command line tool I can do:
$ gcloud compute instances list --filter='tags.items:development'
The documentation claims: "..you can also filter on nested fields. For example, you could filter on instances that have set the scheduling.automaticRestart field to true. Use filtering on nested fields to take advantage of labels to organize and search for results based on label values." But no examples are provided, so it's not clear how one actually goes about this.
I've tried labels.development eq *.*, labels eq *development*, labels:development and so on. I've also tried setting the verbosity of the command line client to info and looking through the output, as well as monitoring the requests that go to the API from the Compute Engine web console, but neither has gotten me anywhere.
Finding Tags with regular expression filters
I'm struggling with the same issue but I think that regular expressions solve the problem.
I have many instances with multiple tags but I can search across all tags with the '~' operator e.g. to find all servers with the production tag:
gcloud compute instances list --filter='tags.items~^production$'
For many servers the 'production' tag is the third entry in tags.items yet the regexp finds it.
This seems to work but I can't find any documentation that specifically says that it should work. The nearest is the section on topic filters which mentions this
key ~ value: True if key matches the RE (regular expression) pattern value.
You can also search for multiple tags
gcloud compute instances list --filter='tags.items~^production$ AND tags.items~^european$'
which would find all servers with the two tags 'production' and 'european'
Tags v Custom metadata
If you want something a bit more flexible than tags (which can only be present or missing), you can attach your own custom multi-valued metadata to an instance (via the UI, command-line or API). You can then search for particular values of that item.
For example suppose I have different instances supporting eCommerce for different brands, I could attach a custom 'brand' metadata item to each server and then find all of the servers which run my "Coca-Cola" brands via ..
gcloud compute instances list --filter="metadata.items.key['brand']['value']='Coca-Cola'"
... and my 'Pepsi Cola' servers with ...
gcloud compute instances list --filter="metadata.items.key['brand']['value']='Pepsi Cola'"
Finding Metadata with regular expression filters
You probably guessed this already but the regular expression operator also works with metadata filters so you can do
gcloud compute instances list --filter="metadata.items.key['brand']['value']~'Cola'"
You can use the following syntax with the gcloud CLI:
gcloud compute instances list --filter labels.env=dev
Multi Filtering example that I'm using:
gcloud compute instances list --filter="zone:( europe-west1-d )" --filter="name:( testvm )" --filter labels.group=devops --filter labels.environment_type=production
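As an alternative to repeating the --filter flag, the terms can be joined into one expression with AND, the operator already used in the tag example earlier in this thread. The Python sketch below only builds the argument list; all_of and list_instances_args are hypothetical helper names, and whether repeated --filter flags are combined by gcloud is not something this sketch relies on.

```python
def all_of(*terms):
    """Join filter terms with AND into a single filter expression."""
    cleaned = [term.strip() for term in terms if term and term.strip()]
    if not cleaned:
        raise ValueError("at least one filter term is required")
    return " AND ".join(cleaned)


def list_instances_args(*terms):
    """Argument list for `gcloud compute instances list` with one --filter."""
    return ["gcloud", "compute", "instances", "list", "--filter=" + all_of(*terms)]


if __name__ == "__main__":
    args = list_instances_args(
        "zone:( europe-west1-d )",
        "name:( testvm )",
        "labels.group=devops",
        "labels.environment_type=production",
    )
    # Pass `args` to subprocess.run(args) to execute it; here it is only shown.
    print(args)
```

Passing the list to subprocess.run without a shell avoids quoting problems with the parentheses and spaces in the expression.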
Add instance metadata MyKey=MyValue for gke-cluster-asia-eas-default-pool-dc8f484c-knbs:
gcloud compute instances add-metadata gke-cluster-asia-eas-default-pool-dc8f484c-knbs --metadata=MyKey=MyValue
Display all instances with MyKey=MyValue:
gcloud compute instances list --filter="metadata.items.key['MyKey'][value]='MyValue'"
Display all instances that belong to the cluster cluster-asia-east1-a:
gcloud compute instances list --filter="metadata.items.key['cluster-name'][value]='cluster-asia-east1-a'"
NAME                                             ZONE          MACHINE_TYPE   PREEMPTIBLE  INTERNAL_IP  EXTERNAL_IP     STATUS
gke-cluster-asia-eas-default-pool-dc8f484c-knbs  asia-east1-a  n1-standard-1               10.140.0.2   104.155.227.25  RUNNING
gke-cluster-asia-eas-default-pool-dc8f484c-x8cv  asia-east1-a  n1-standard-1               10.140.0.3   104.199.226.16  RUNNING
gke-cluster-asia-eas-default-pool-dc8f484c-z5wv  asia-east1-a  n1-standard-1               10.140.0.4   104.199.134.9   RUNNING

Jmeter: How to map specific variable values from CSV file to specific thread-groups in a test plan

I have a test plan with 12 thread-groups, each of which is one test scenario. I want to use unique login credentials for each thread-group. So I've created a CSV file, added a CSV Data Config element to each thread-group and selected "All Threads" in "Sharing mode". Whenever I execute the test plan (all thread-groups concurrently), the thread-groups do not take the variable rows sequentially. I expected that the 1st thread-group in the test plan would use the 1st row of variables in the CSV file, based on the post: JMeter test plan with different parameter for each thread
But that is not happening, and I am unable to understand the pattern of variable allocation. Please help me resolve my issue.
My CSV file looks like below:
userName,password,message
userone,sample123,message1
usertwo,sample123,message2
... and so on
Refer below for configuration of CSV Data Config element:
Thanks!
Threads and thread groups are different things. When you choose "All Threads" in "Sharing mode", it just means that all threads in the same thread group will share the CSV. Thread groups are always independent.
You have 2 simple options:
1. Use one thread group and control what users are doing with controllers. For example, a Throughput Controller lets you control how many threads run each scenario within the same thread group.
2. Split your CSV so that each thread group has its own CSV.
And many more complicated options, for example:
1. Use the __CSVRead or __StringFromFile function, which reads one line at a time. That way you can assign each thread group a range of lines to read, rather than reading the entire file.
2. If your usernames and passwords are predictable (e.g. user1, user2, etc.), you could use a counter and a range for each thread group.
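The "split your CSV" option could be prepared outside JMeter with a small script. The Python sketch below deals the data rows round-robin into one CSV per thread group and repeats the header in each; the function name split_csv and the output names group_1.csv, group_2.csv, ... are assumptions made for the example.

```python
import csv
import io


def split_csv(csv_text, group_count):
    """Return {file name: CSV text}, dealing data rows round-robin to each group."""
    if group_count < 1:
        raise ValueError("group_count must be at least 1")
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = [row for row in reader if row]  # ignore blank lines
    buckets = [[] for _ in range(group_count)]
    for index, row in enumerate(rows):
        buckets[index % group_count].append(row)

    files = {}
    for number, bucket in enumerate(buckets, start=1):
        out = io.StringIO()
        writer = csv.writer(out, lineterminator="\n")
        writer.writerow(header)  # every file keeps the variable names
        writer.writerows(bucket)
        files["group_%d.csv" % number] = out.getvalue()
    return files


if __name__ == "__main__":
    sample = (
        "userName,password,message\n"
        "userone,sample123,message1\n"
        "usertwo,sample123,message2\n"
    )
    for name, text in split_csv(sample, 2).items():
        print(name)
        print(text)
```

Each thread group's CSV Data Config element would then point at its own file, so the groups no longer compete for rows.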

Create n agents and calculate average number

I want to create a system of n agents. Each agent generates a random integer value. My goal is to calculate the average of these n numbers.
My simple idea for the algorithm:
Every agent sends a message with its number to the other agents
Every agent calculates the average
Problems:
I can't work out how to create a variable number of agents
How can I collect the output
Does anybody know how I can do this?
The examples online tend to focus on using the Boot class:
java -cp jade.jar jade.Boot -agents agentName:org.agents.MyAgentClass
You could spawn more agents simply by adding more to the -agents option (separated by semicolons; quote the value so the shell does not treat ; as a command separator):
java -cp jade.jar jade.Boot -agents \
"agent1:org.agents.MyAgentClass;agent2:org.agents.MyAgentClass"
If you need a variable number of agents, you could move this to a bash script that appends more agents depending on a parameter.
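The same idea can be sketched in Python instead of bash. This is only an illustration: agents_arg and boot_command are hypothetical helpers, and the class name and jar path are the ones used in the commands above.

```python
def agents_arg(count, agent_class="org.agents.MyAgentClass", prefix="agent"):
    """Build the value of the -agents option for `count` agents of one class."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return ";".join(
        "%s%d:%s" % (prefix, number, agent_class) for number in range(1, count + 1)
    )


def boot_command(count, jar="jade.jar"):
    """Argument list for starting jade.Boot with `count` agents."""
    return ["java", "-cp", jar, "jade.Boot", "-agents", agents_arg(count)]


if __name__ == "__main__":
    # To launch, pass the list to subprocess.run(boot_command(5)); here it is only shown.
    print(boot_command(5))
```

Because the command is a list and no shell is involved, the semicolons in the -agents value need no quoting.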
If you really want to go crazy, you can create your own container and add agents to it from your own code and bypass the Boot class. Since your use case is so simple, I don't know that this would be a good way to go yet.

Retrieving multiple objects from LDAP by DN at once

I have a list of DNs, and for performance reasons I want to retrieve the attributes of every DN in the list in a single trip to the LDAP server.
It seems that searching by DN, i.e., using the DN in a search filter, is not possible:
Using DN in Search Filter
http://www.openldap.org/lists/openldap-software/200503/msg00520.html
Is there any alternative?
Sure you can.
ldapsearch -h <ldaphost> -b "cn=joe,dc=yourdomain,dc=com" -s base -D cn=admin,dc=yourdomain,dc=com -W "(objectclass=*)" "*"
This will retrieve all user attributes for the DN cn=joe,dc=yourdomain,dc=com.
But, for the list, you would need to repeat the search for each one.
We often do this in a bash script.
Can you use a filter to identify which DNs you need?
-jim
It seems this is only possible in Active Directory. All I had to do was filter by the distinguishedName attribute; in my tests, however, there was no performance gain.
Active Directory includes the distinguishedName attribute on every
object; the value is the object's DN. The following example elaborates
the previous example to include a value of distinguishedName on each
object.
http://msdn.microsoft.com/en-us/library/cc223167.aspx
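For the Active Directory case, a single OR filter over distinguishedName can be built as in the Python sketch below. It only constructs the filter string and assumes the server exposes distinguishedName as described above; dn_filter and escape_filter_value are hypothetical helper names, cn=ann is a made-up second entry, and the escaping covers the special characters of the LDAP filter syntax (backslash, *, parentheses and NUL).

```python
def escape_filter_value(value):
    """Escape characters that are special inside an LDAP search filter value."""
    special = "\\*()\x00"
    return "".join(
        "\\%02x" % ord(char) if char in special else char for char in value
    )


def dn_filter(dns):
    """Build one filter that matches any of the given DNs via distinguishedName."""
    if not dns:
        raise ValueError("at least one DN is required")
    terms = "".join(
        "(distinguishedName=%s)" % escape_filter_value(dn) for dn in dns
    )
    # A single term needs no OR wrapper.
    return terms if len(dns) == 1 else "(|%s)" % terms


if __name__ == "__main__":
    print(dn_filter([
        "cn=joe,dc=yourdomain,dc=com",
        "cn=ann,dc=yourdomain,dc=com",  # hypothetical second entry
    ]))
```

The resulting string would be used as the search filter with a subtree search from a base that contains all the listed entries.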