Multiple slaves on a single machine with Hudson - configuration

Can I run multiple Hudson slaves on a single machine? I mean real slaves, each with only one build process?
My problem is that I have a slave with 3 build processes (executors) and use the Locks and Latches plugin (v0.4) to run three different kinds of build jobs. But sometimes more than one build job of one kind runs at the same time, or a job blocks an executor on the slave and doesn't run.
Thank you in advance for your insights.

Yes, Hudson should be capable of running multiple slaves on a single machine. I do a limited form of this with my builds so that each job runs on a separate hard drive. In my case, this means I have a master, with a slave that runs on the same machine as the master. Having 3 slaves, each with 1 executor, instead of one slave with 3 executors could be done, but it shouldn't affect locking, so I only see a use for that if you have different physical drives and want more throughput.
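For illustration, here is a rough sketch of launching two such slaves on one machine as headless JNLP agents, assuming each is already defined in Hudson with 1 executor and its own remote FS root (the hostname and node names are placeholders):
# one agent process per node defined on the Hudson master
java -jar slave.jar -jnlpUrl http://hudson.example.com/computer/build-slave-1/slave-agent.jnlp &
java -jar slave.jar -jnlpUrl http://hudson.example.com/computer/build-slave-2/slave-agent.jnlp &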
I believe locks in both Hudson (i.e. this job is running) and Locks and Latches (this lock is in use) span all slaves and the master for a given Hudson setup. So if slave 1 is running a job that holds lock A, slave 2 won't be able to start a job that needs lock A either. It isn't entirely clear to me if this is the behavior you're seeking.
There is one important note, though:
Supposedly there is currently a bug in the Hudson core that sometimes allows multiple jobs to start with the same lock when using the Locks and Latches plugin. I am not an expert on the internals of Hudson locking, nor the Locks and Latches plugin, but if you want a more in-depth explanation, there is a conversation that sounds related on the Hudson users mailing list (users#hudson.dev.java.net).
Here is the archived conversation.
The author of the locks-and-latches plugin is usually pretty responsive to questions.

Related

Reason for Multiple MySQL Slave Databases

For my application I will have one master DB with one slave DB; the slave will be used to run my backups on without interrupting my application. However, I have seen examples with one master and multiple slaves, and I am wondering why, and whether my application would benefit from having more than one slave in some way I have not thought of.
So put simply, what could be the reasons for having more than one slave?
Multiple slaves allow you to distribute your reads. If you have a read-heavy application, you can scale it with multiple slave servers. It also offers a layer of fault tolerance: if your master dies, you can promote one of the slaves to be the new master.
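As a rough sketch of the failover part (the hostname is a placeholder, and this assumes a MySQL version where RESET SLAVE ALL is available), promoting a slave boils down to something like:
mysql -h slave1 -e "STOP SLAVE;"                    # stop applying changes from the dead master
mysql -h slave1 -e "RESET SLAVE ALL;"               # forget the old master configuration
mysql -h slave1 -e "SET GLOBAL read_only = OFF;"    # allow writes, if the slave was read-only
# ...then point your application (and any remaining slaves) at slave1.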

Snapshot of EBS volume used for replication

I set up an EC2 instance with MySQL on an EBS volume and set up another instance which acts as a slave for replication. The replication setup was fine. My question is about taking snapshots of these volumes. I noticed that the tables need to be locked for the snapshot process, which may cause inconvenience for the users. So my idea is to leave the master instance alone and take a snapshot of the instance acting as the slave. Is this a good idea? Is there anyone out there with a similar setup who could guide me in the right way?
Also, taking a snapshot of the slave instance would require locking the tables. Would that mean replication will break?
Thanks in advance.
Though it's a good idea to lock the database and freeze the file system when you initiate the snapshot, the actual API call to initiate the snapshot takes a fraction of a second, so your database and file system aren't locked/frozen for long.
That said, there are a couple of other considerations you did not mention:
When you attempt to create the lock on the database, it might need to wait for other statements to finish before the lock is granted. During this time, your pending lock might cause further statements to wait until you get and release the lock. This can cause interruptions in the flow of statements on your production database.
After you initiate the creation of the snapshot, your application/database is free to use the file system on the volume, but if you have a lot of writes, you could experience high iowait, sometimes enough to create a noticeable slowdown of your application. The reason for this is that the background snapshot process needs to copy a block to S3 before it will allow a write to that block on the active volume.
I solve the first issue by requesting a lock and timing out if it is not granted quickly. I then wait a bit and keep retrying until I get the lock. Appropriate timeouts and retry delay may vary for different database loads.
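As a minimal sketch of that retry logic (acquire_db_lock is a hypothetical helper that requests the read lock and exits non-zero if it isn't granted within the timeout; the numbers are only illustrative):
for attempt in 1 2 3 4 5; do
    if acquire_db_lock --timeout 5; then    # hypothetical: FLUSH TABLES WITH READ LOCK with a bounded wait
        break                               # lock acquired; proceed with the snapshot
    fi
    sleep 10                                # back off before retrying; tune for your database load
done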
I solve the second problem by performing the frequent, consistent snapshots on the slave instead of the master, just as you proposed. I still recommend performing occasional snapshots against the master simply to improve its intrinsic durability (a deep EBS property) but those snapshots do not need to be performed with locking or freezing as you aren't going to use them for backups.
I also recommend the use of a file system that supports flushing and freezing (XFS). Otherwise, you are snapshotting locked tables in MySQL that might not yet have all their blocks on the EBS volume, or other parts of the file system might be modified and inconsistent in the snapshot.
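For example, with XFS the freeze/snapshot/thaw window can look roughly like this (the mount point and volume ID are placeholders, the modern AWS CLI is used only for illustration, and a FLUSH TABLES WITH READ LOCK is assumed to be held in another session for the duration):
xfs_freeze -f /mnt/mysql-data    # flush and freeze the file system backing the EBS volume
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "mysql slave snapshot"
xfs_freeze -u /mnt/mysql-data    # thaw as soon as the snapshot has been initiated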
If you're interested, I've published open source software that performs the best practices I've collected related to creating consistent EBS snapshots with MySQL and XFS (both optional).
http://alestic.com/2009/09/ec2-consistent-snapshot
To answer your last question, locking tables in the master will not break replication. In my snapshot software I also flush the tables with read lock to make sure that everything is on the disk being snapshotted and I add the keyword "LOCAL" so that the flush is not replicated to any potential slaves.
You can definitely take a snapshot of the slave.
From your description, it does not seem like the slave is being used operationally.
If this is the case, then the safest method of obtaining a reliable volume snapshot would be to (a command-line sketch follows these steps):
Stop the MySQL server on the slave
Start the snapshot (either through the AWS Console, or by command line)
When the snapshot is complete, restart mysqld on the slave server
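A minimal sketch of that sequence, assuming the modern AWS CLI is available (the service name and volume ID are placeholders):
sudo service mysql stop                                      # 1. stop the MySQL server on the slave
SNAP_ID=$(aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 \
    --description "mysql slave backup" --query SnapshotId --output text)    # 2. start the snapshot
aws ec2 wait snapshot-completed --snapshot-ids "$SNAP_ID"    # 3. wait until the snapshot is complete
sudo service mysql start                                     # ...then restart mysqld on the slave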

Hudson/Jenkins - Run steps in master and slave under the same job

I have a master and a slave machine and one job.
This job should have two steps: one to run unit tests on the master machine,
and the other to run some executable lying on the slave machine.
Can this be done under one job? I know that I can restrict the job to run on the slave only,
but I couldn't find a way to apply the restriction at the step level.
As far as I know, you can only bind a job to a particular node, but not parts of the job.

How to prevent certain Jenkins jobs from running simultaneously?

I have a couple of jobs that use a shared resource (database), which sometimes can cause builds to fail in the (rare) event that the jobs happen to get triggered simultaneously.
Given jobs A through E, for example, is there any way to specify that A and C should never be run concurrently?
Other than the aforementioned resource, the builds are independent of each other (not, e.g., in an upstream/downstream relation).
A "brute-force" way would be limiting the number of executors to one, but that obviously is less than ideal if most jobs could well be executed concurrently and there's no lack of computing resources on the build server.
There are currently 2 ways of doing this:
Use the Throttle Concurrent Builds plugin.
Set up those jobs to run on a slave having only 1 executor.
The Locks and Latches plugin here should help.
This question is probably a dupe of How do I ensure that only one of a certain category of job runs at once in Hudson?
That's an old question, but the topic can still be relevant, especially when running application tests on Jenkins.
The Lockable Resources Plugin allows you to define lockable resources that can be used by builds. If your build requires a resource, it takes the lock. If a second build requires the same resource (which is then already locked), it will be queued until the resource is free.
Although the docs use computers or printers as examples for lockable resources, the database example from above should work as well.
Unlike the Locks and Latches plugin mentioned in the answers from 2012, this plugin seems to be actively maintained (as of ~2016).
Have a look at the External Resource Dispatcher Jenkins plugin, which was first published in November 2012. This (relatively) new plugin seems to exactly cover this use case.
N.B. You don't need physical or virtual hardware for a slave/node; you can set up "slaves" that run on the master server.
Manage Jenkins > Manage Nodes > New node
and make a "dumb slaves" each with its own root directory.
Create a few slaves, execute them when the server boots, and then you have essentially created pools of executors.
You might have, say...
db - only one executor in your case.
compile - limit according to hardware or # of CPUs.
scripts - have many executors for all those little jobs that Jenkins is good at doing.
This is an old question, and whether this will work for your application I can't be sure, as you didn't mention details of your application. However, I wanted to add the way that I handled this in our Rails application test suite.
Our application's database configuration (database.yml) isn't in the source repository. Instead, it lives in /var/lib/configs/uniquing_database.yml on the VM which runs our Jenkins instance.
One of the steps of our build process involves copying this config file to the project workspace:
cp /var/lib/jenkins/configs/myapp_unique_database.yml config/database.yml
and that config takes workspace and build number information exposed to the environment by Jenkins into account in order to create a uniquely named database for that job and its specific execution:
test:
  adapter: postgresql
  encoding: unicode
  host: 127.0.0.1
  port: 5432
  database: myapp_test<%= ENV['JOB_NAME'].split('/').last %><%= ENV['BUILD_NUMBER'] %>
The rest of our build proceeds without any knowledge or care that it's running in a distinct database. Finally, at the end of our build, we make sure to drop that database so we don't have a bunch of test databases polluting the file system:
RAILS_ENV=test bundle exec rake db:drop

Reconfigure and reboot a Hudson/Jenkins slave as part of a build

I have a Jenkins (Hudson) server setup that runs tests on a variety of slave machines. What I want to do is reconfigure the slave (using remote APIs), reboot the slave so that the changes take effect, then continue with the rest of the test. There are two hurdles that I've encountered so far:
Once a Jenkins job begins to run on the slave, the slave cannot go down or break the network connection to the server; otherwise Jenkins immediately fails the test. Normally, I would say this is completely desirable behavior. But in this case, I would like Jenkins to accept the disruption until the slave comes back online and Jenkins can reconnect to it - or the slave reconnects to Jenkins.
In a job that has been attached to the slave, I need to run some build tasks on the Jenkins master - not on the slave.
Is this possible? So far, I haven't found a way to do this using Jenkins or any of its plugins.
EDIT - Further Explanation
I really, really like the Jenkins slave architecture. Combined with the plugins already available, it makes it very easy to get jobs to a slave, run, and the results pulled back. And the ability to pick any matching slave allows for automatic job/test distribution.
In our situation, we use virtualized (VMware) slave machines. It was easy enough to write a script that would cause Jenkins to use VMware PowerCLI to start the VM up when it needed to run on a slave, then ship the job to it and pull the results back. All good.
EXCEPT: part of the setup of each test is to slightly reconfigure the virtual machine in some fashion. Disable UAC, log on as a different user, have a different driver installed, etc. - each of these changes requires that the test VM/slave be rebooted before the changes take effect. Although I can write slave on-demand scripts (Launch Method=Launch slave via execution of command on the master) that handle this reconfig and restart, it has to be done BEFORE the job is run. That's where the problem occurs - I cannot configure the slave that early because the configuration changes depend on the job being run, which occurs only after the slave is started.
Possible Solutions
1) Use multiple slave instances on a single VM. This wouldn't work - several of the configurations are mutually exclusive, but Jenkins doesn't know that. So it would try to start one slave configuration for one job, another slave for a different job - and both slaves would be on the same VM. Locks on the jobs don't prevent this since slave starting isn't part of the job.
2) (Optimal) A build step that allows a job to know that its slave connection MIGHT be disrupted. The build step may have to include some options so that Jenkins knows how to reconnect the slave (will the slave reconnect automatically, will Jenkins have to run a script, will simple SSH suffice). The build step would handle the disconnect of the slave, ignore the usually job-failing disconnect, then perform the reconnect. Once the slave is back up and running, the next build step can occur. Perhaps a timeout to fail the job if the slave isn't reconnectable in a certain amount of time.
Current Solution - less than optimal
Right now, I can't use the slave function of Jenkins. Instead, I use a series of build steps - run on the master - that use Windows and PowerShell scripts to power on the VM, make the configurations, and restart it. The VM has a SSH server running on it and I use that to upload test files to the test VM, then remote execute them. Then download the results back to Jenkins for handling by the job. This solution is functional - but a lot more work than the typical Jenkins slave approach. Also, the scripts are targeted towards a single VM; I can't easily use a pool of slaves.
Not sure if this will work for you, but you might try making the Jenkins agent node programmatically tell the master node that it's offline.
I had a situation where I needed to make a Jenkins job that performs these steps (all while running on the master node):
revert the Jenkins agent node VM to a powered-off snapshot
tell the master that the agent node is disconnected (since the master does not seem to automatically notice the agent is down, whenever I revert or hard power off my VMs)
power the agent node VM back on
as a "Post-build action", launch a separate job restricted to run on the agent node VM
I perform the agent disconnect step with a curl POST request, but there might be a cleaner way to do it:
curl -d "offlineMessage=&json=%7B%22offlineMessage%22%3A+%22%22%7D&Submit=Yes" http://JENKINS_HOST/computer/THE_NODE_TO_DISCONNECT/doDisconnect
Then when I boot the agent node, the agent launches and automatically connects, and the master notices the agent is back online (and will then send it jobs).
I was also able to toggle a node's availability on and off with this command (using 'toggleOffline' instead of 'doDisconnect'):
curl -d "offlineMessage=back_in_a_moment&json=%7B%22offlineMessage%22%3A+%22back_in_a_moment%22%7D&Submit=Mark+this+node+temporarily+offline" http://JENKINS_HOST/computer/NODE_TO_DISCONNECT/toggleOffline
(Running the same command again puts the node status back to normal.)
The above may not apply to you since it sounds like you want to do everything from one Jenkins job running on the agent node. And I'm not sure what happens if an agent node disconnects or marks itself offline in the middle of running a job. :)
Still, you might poke around in this Remote Access API doc a bit to see what else is possible with this kind of approach.
Very easy: you create a master job that runs on the master, and from the master job you call the client job as a build step (it's a new kind of build step, and I love it). You need to check the option so that the master job waits for the client job to finish. Then you can run your script to reconfigure your client and run the second test on the client.
An even better strategy is to have two nodes running on your slave machines. You need to configure two nodes in Jenkins. I used that strategy successfully with a Unix slave. The reason was that I needed different environment variables to be set up, and I didn't want to push that into the jobs. I used SSH clients, so I don't know if it is possible with different client types. Then you might be able to run both tests at the same time, or you can chain the jobs or use the master strategy mentioned above.