ZooKeeper Multi-Server Setup by Example - configuration

The ZooKeeper multi-server configuration docs show the following settings, which can be placed inside zoo.cfg (ZooKeeper's config file) on each server:
tickTime=2000
dataDir=/var/zookeeper/
clientPort=2181
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
Furthermore, they state that you need a myid file on each ZK node whose content matches one of the server.id values above. So for example, in a 3-node "ensemble" (ZK cluster), the first node's myid file would simply contain the value 1. The second node's myid file would contain 2, and so forth.
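For instance, with dataDir=/var/zookeeper/ as in the config above, the myid files could be created with something like the following sketch (adjust the path to whatever dataDir you actually use):
echo "1" > /var/zookeeper/myid    # on the first node
echo "2" > /var/zookeeper/myid    # on the second node, and so on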
I have a few practical questions about what this looks like in the real world:
1. Can localhost be used? If zoo.cfg has to be repeated on each node in the ensemble, is it OK to define the current server as localhost? For example, in a 3-node ensemble, would it be OK for Server #2's zoo.cfg to look like:
tickTime=2000
dataDir=/var/zookeeper/
clientPort=2181
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=localhost:2888:3888 # After all, we're on server #2!
server.3=zoo3:2888:3888
Or is this not advised/not possible?
2. Do the server ids have to be numerical? For instance, could I have a 5-node ensemble where each server's zoo.cfg looks like:
tickTime=2000
dataDir=/var/zookeeper/
clientPort=2181
initLimit=5
syncLimit=2
server.red=zoo1:2888:3888
server.green=zoo2:2888:3888
server.blue=zoo3:2888:3888
server.orange=zoo4:2888:3888
server.purple=zoo5:2888:3888
And, say, Server 1's myid would contain the value red inside of it (etc.)?

1. Can localhost be used?
This is a good question, as the ZooKeeper docs don't make it crystal clear whether the configuration file only accepts IP addresses. They say only "hostname", which could mean an IP address, a DNS name, or a name from the hosts file such as localhost:
server.x=[hostname]:nnnnn[:nnnnn], etc
(No Java system property)
servers making up the ZooKeeper ensemble. When the server starts up, it determines which server it is by looking for the file myid in the data directory. That file contains the server number, in ASCII, and it should match x in server.x in the left hand side of this setting.
However, note that ZooKeeper recommends using the exact same configuration file on all hosts:
ZooKeeper's behavior is governed by the ZooKeeper configuration file. This file is designed so that the exact same file can be used by all the servers that make up a ZooKeeper server assuming the disk layouts are the same. If servers use different configuration files, care must be taken to ensure that the list of servers in all of the different configuration files match.
So simply put in each machine's IP address (or resolvable hostname) and everything should work. Also, I have personally tested using 0.0.0.0 for the current server's own entry (in a situation where the interface IP address was different from the public IP address) and it does work.
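For example, Server #2's zoo.cfg in the scenario above could look like the sketch below, keeping the full server list and using 0.0.0.0 only for its own entry (the other entries stay as resolvable names or IPs):
tickTime=2000
dataDir=/var/zookeeper/
clientPort=2181
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=0.0.0.0:2888:3888
server.3=zoo3:2888:3888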
2. Do the server ids have to be numerical?
From the ZooKeeper multi-server configuration docs, myid needs to be a numerical value from 1 to 255:
The myid file consists of a single line containing only the text of that machine's id. So myid of server 1 would contain the text "1" and nothing else. The id must be unique within the ensemble and should have a value between 1 and 255.
Since myid must match the x in the server.x parameter, we can infer that x must be a numerical value as well.

How to retrieve lost file from IPFS?

A couple of weeks ago I tried spinning up a local IPFS node, published a file, and was able to access it via a public gateway. I thought the file would have been stored by a lot of nodes, so I deleted it from my local machine; now I can't access the file via its ID (QmNvxsaXqWoLR1NNJpiRXTEo57ptyg3CjSGBrgeyiyFiPm) anymore.
I noticed that I can still somehow access the data from the WebUI, but I'm only able to see the raw data instead of the file. Is there any way to retrieve it?
I actually can retrieve this CID via a simple ipfs cat QmNvxsaXqWoLR1NNJpiRXTEo57ptyg3CjSGBrgeyiyFiPm:
{
"0x9a39f286e1cd710da14e45ac124e38f2b6242622": "4.705",
"0x7c981d31b2ab65ce9f9cce49feac9e9e11e8ca64": "0.174481",
"0xa83cdaaadbb0e01d5de8df4a670947eacbb11f7e": "0.860812",
"0x445f4b54039cb1f86644351f2ef324c6876f6d76": "0.036128",
"0x29eab4341629aa1ae5e996f76ea0750548311ecf": "5.4",
"0xbbccf6cab5b3aec26b0cbc6095b5b6ddbacfd59a": "17.172011",
"0x33d5ae030cf11723f9b34ecc6fe5cfe00c6dc133": "0.001909",
"0x03886228bb749eeba43426d2d6b70eba472f4876": "6.8",
"0x1eb8e88a563fde7b3b8ebbbb0e1ac117c3d80800": "1821.138157",
"0x62ba33ccc4a404456e388456c332d871dae7ae9e": "0.000145",
"0x63e62588330657c99ba79139e7c21af0c0db1e7e": "12.560212",
"0xcd45fdaa6a72740e1d092f458213ff39d3d94a10": "280.592062",
"0xb92667e34cb6753449adf464f18ce1833caf26e0": "0.647424",
"0x9a5179e08acf37b3d84c9a0c0d6f3ea2417f9175": "10.097725",
"0xc43cffc5db578879cc5d0d4cfe07ad514c934d3b": "6.365907",
"0x34915628fc56ae8ff6684be39462e7ba398164b8": "0.00069",
"0x47e2bc7475ef8a9a5e10aef076da53d3c446291e": "5.305",
"0xf432d70c941ebe657ca8cff0b70d1649d5781eea": "0.153823",
"0xff90d66d41fc97b223e8005dba51635b5d49632b": "0.002298",
"0x1cf41ad63f67f3e7f8a1db240d812f5392b9a9c4": "6.05013",
"0xc418aaa0d1e018ded3efc0f72a089519b3d58683": "0.179902",
"0x7d209486a3562fe406b72d65b3703884c50bac81": "2.191224",
"0xe782657a1043062087232b3c20c4d25e2a982cb3": "0.110927",
"0xd998e5a4777e1b47c1441a88bb553cbf16802e4c": "0.095045",
"0x9f3ef50ea64adad5b33f1f8222760cfbf42007f7": "0.069055",
"0x40c1efa324fd80329117409c65081f13e7a08a42": "2790.399058",
"0x9ef8c5ae4a320ef0984695af9a85d07f5be13792": "0.139741",
"0xf46422c1b6c2135dbca9b55771fd6e7869a8691c": "995.479262",
"0xf6f3bc09782d3c0df474eb3cec5cac8423bfedf3": "0.00012",
"0x4f2769e87c7d96ed9ca72084845ee05e7de5dda2": "0.000509",
"0x92f1e9a52c1a81fdb76ee6477c0c605917cddbe5": "0.811623",
"0x1e6424a481e6404ed2858d540aec37399671f5e0": "19253.760913",
"0xc9b2c3a6a8e1896aadcf236b88019c7574d75069": "781.127767",
"0xb08f95dbc639621dbaf48a472ae8fce0f6f56a6e": "34.704074"
}
I thought the file would have been stored by a lot of nodes, so I deleted it from my local machine
It's important to note that data is only stored by other nodes temporarily, and only if they access the content themselves. If you want data to live reliably long-term, you can use a pinning service like Pinata, since you're paying them to keep your data pinned.
Otherwise you have to rely on other nodes pinning your data to ensure it remains available.
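One way to keep content available going forward is to pin it on a node you control (in addition to, or instead of, a paid pinning service); a minimal sketch with the ipfs CLI, using the CID from above:
ipfs pin add QmNvxsaXqWoLR1NNJpiRXTEo57ptyg3CjSGBrgeyiyFiPm   # fetch the content (if still reachable) and pin it locally
ipfs pin ls --type=recursive                                  # confirm the pin, so garbage collection won't remove it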

How to interpret tcl command in openOCD manual

I'm completely new to Tcl and am trying to understand how to script the command "adapter usb location" in OpenOCD.
From the OpenOCD manual, the command has this description:
I want to point it to the port with the red arrow below:
Thanks.
It's not 100% clear, but I would expect (from that snippet of documentation) a bus location to be a dotted “path” something like:
1-6
where the values are:
1 — Bus ID
6 — Port ID
Which would result in a call to the command being done like this:
adapter usb location 1-6
When there's a more complex structure involved (internally because of chained hubs) such as with the item above the one you pointed at, I'd instead expect:
1-5.3
Notice that there is a sequence of port IDs (5.3) in there to represent the structure. The resulting call would then be:
adapter usb location 1-5.3
Now for the caveats!
I can't tell what the actual format of those IDs is. They might just be numbers, or they might have some textual prefix (e.g., bus1-port6); such text prefixes, if present, might contain a space (or other metacharacter), which would be deeply annoying to work with. You should be able to run adapter usb location without any other arguments to see what the current location is, though be aware that it might return the empty string (or give an error) if there is no current location. I welcome feedback on this, as that information appears not to be present in any online documentation I can find (and I don't have things installed, so I can't just check).
I also have no idea what (if anything) to do with the device and interface IDs.
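As a rough way to find the bus/port path on Linux, you can look at the USB topology the kernel reports; the mapping to the exact string OpenOCD expects is an assumption on my part:
lsusb -t        # shows each device as a Bus number plus a chain of ports, e.g. Bus 01, Port 5, then hub Port 3
# which, if the guess above is right, would translate into the config line:
adapter usb location 1-5.3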

get host name for jinja template in salt

Not sure where to start, but here is what I have and what I'm trying to do.
What I have:
I have three minions as part of a three-tier application named employee.
There are three servers: web01 as the web server, app01 as the app server, and db01 as the database server.
Each server has grains set on it; here are the servers and their grains keys and values:
web01:
  appname: employee
  tier: web
app01:
  appname: employee
  tier: app
db01:
  appname: employee
  tier: db
What I'm trying to do:
I'm trying to push configuration files to web01 and app01. These config files contain a variable (the hostname of another tier's minion): the config on web01 should contain the name app01, and the config on app01 should contain the name db01. The names of these servers should be looked up based on the grains values.
For example: the hostname of the app server is the hostname of the minion whose grains match "appname:employee and tier:app".
I'm not sure how to do it. I'm too new to Salt and don't have much experience with it or with Jinja templates.
Any help would be really appreciated.
Thank you
So if I understand you right, you want the config file on web01 and app01 to contain the relevant hostnames.
If so, you can use a pillar file where you state these attributes.
/srv/pillar/employee.sls:
employee:
  hostname_of_another_tier_minion: hostname.example.com
You can then reference this in your jinja template /srv/formulas/employee/templates/config.conf.jinja:
----------
hostname_of_another_tier_minion {{ pillar['employee']['hostname_of_another_tier_minion'] }}
Just to be complete, you reference your template in /srv/employee/web.sls and /srv/employee/app.sls:
web-config-file:
  file.managed:
    - user: root
    - group: root
    - template: jinja
    - mode: '0644'
    - names:
      - /etc/<web-conf-dir>/web.conf:
        - source: salt://employee/templates/config.conf.jinja
Let me know if you have any further questions.
UPDATE:
If the hostnames are unknown as you said, you can first get them with grains and then put them in the jinja template that gets rendered into a config on every server.
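If you don't want to hard-code the hostname in pillar, one possible approach (an assumption on my part; it requires the Salt Mine to be set up, e.g. mine_functions exposing network.get_hostname on the minions) is to look the host up by its grains from inside the Jinja template:
{# grab the hostname of the minion matching the app-tier grains via the mine, using a compound match #}
{% set app_hosts = salt['mine.get']('G@appname:employee and G@tier:app', 'network.get_hostname', tgt_type='compound') %}
hostname_of_another_tier_minion {{ app_hosts.values() | list | first }}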

Partitioning data across hosts in Ansible (access "index" of host in task?)

I am trying to use Ansible to do some parallel computation. My data is trivially parallelizable, I just need to split the file across my hosts (EC2 instances). Is there a canonical way to do this?
The next best thing would be to have a counter that increments for each host. Assuming I have already split my data into one piece per worker, I would like to be able to say within each worker task:
- file: src=data/users-{{ host_index }}.csv dest=/mnt/users.csv
Then, each worker can process its copy of users.csv with a separate script that is agnostic to which set of users it has. Is there any way to get this counter index?
I am a beginner to Ansible, so I wonder whether I am overlooking a simple module or idiom, either in Ansible or Jinja. Thanks in advance.
It turns out I have access to a variable called ami_launch_index via the ec2_facts module, which gives a zero-indexed unique ID for each EC2 instance. Here is the code for copying files with numerical suffixes to their corresponding EC2 instances:
tasks:
  - name: Gather ec2 facts
    action: ec2_facts
    register: facts
  - name: Share data to nodes
    copy: src=data/websites-{{ facts.ansible_facts.ansible_ec2_ami_launch_index }}.txt dest=/mnt/websites.txt
The copy line produces the following for the src values:
data/websites-1.txt
data/websites-0.txt
data/websites-2.txt
(There is no guarantee that the hosts will iterate in ami_launch_index order)
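If the hosts aren't EC2 instances (or you want a deterministic ordering), another common idiom, sketched here under the assumption that the workers live in an inventory group named workers, is to use each host's index within that group:
tasks:
  - name: Share data to nodes
    copy: src=data/websites-{{ groups['workers'].index(inventory_hostname) }}.txt dest=/mnt/websites.txt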

Requesting nodes by numbers and their names in SGE

How do I request a number of nodes (not processors) when submitting a job in SGE?
For example, in TORQUE we can specify qsub -l nodes=3.
How do I request nodes by their names in SGE?
For example, in TORQUE we can do this with qsub -l nodes=abc+xyz+pqr, where abc, xyz and pqr are hostnames.
For a single hostname, qsub -l hostname=abc works. But how do I delimit multiple hostnames in SGE?
Requesting a number of nodes with Grid Engine is done indirectly. When you want to submit a parallel job, you have to request a parallel environment (man sge_pe) together with the number of slots (processors, etc.), e.g. qsub -pe mytestpe 12. Depending on the allocation_rule defined in the parallel environment (qconf -sp mytestpe), the slots are distributed over one or more nodes. If you have a so-called fixed allocation rule, where you simply use a number such as 4 as the allocation rule (4 slots per host), it is easy: if you want one host, just submit with -pe mytestpe 4; if you want 10 nodes, submit with -pe mytestpe 40. A sketch of such a parallel environment definition follows below.
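For reference, the allocation_rule lives in the parallel environment definition; qconf -sp mytestpe might print something roughly like this (the values here are illustrative, not taken from a real cluster):
pe_name            mytestpe
slots              400
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    4
control_slaves     FALSE
job_is_first_task  TRUE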
A node name can be requested with -l h=abc. Since node names are RESTRINGs (regular expression strings) in Grid Engine, you can create a regular expression for host filtering: qsub -l h="abc|xyz". You can also create host groups (qconf -ahgrp) and request so-called queue domains (qsub -q all.q@@mygroup).
Daniel
http://www.gridengine.eu
You can use -tc to limit the number of concurrent tasks (i.e., the number of slots that will be used for an array job). I use this when I submit array jobs with 100 sub-jobs to limit the impact on our queue, defaulting to 10 simultaneous tasks with -tc 10. As each task finishes, another task from the pending pool is started.
The only way I've been able to figure out to do this would be to set up specific resource quota sets (using qconf -mrqs) specifying the particular host groups you want to use. You would have to set up all of the combinations you want first. I don't see a real reason to specify specific hosts, though, unless those hosts have specific resources you want to use (in which case I'd set up consumable resources for those, apply the appropriate number of resources to each host that can supply them, and then request that resource instead of specifying specific hosts for a particular job).
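As an illustration, a resource quota set edited via qconf -mrqs might look roughly like this; the quota name, host group, and slot count are made up for the example:
{
   name         limit_mygroup_slots
   description  "Cap jobs on hosts in @mygroup at 40 slots"
   enabled      TRUE
   limit        hosts {@mygroup} to slots=40
}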