wait for the end of another job before starting one - hudson

I have two jobs on Hudson, A & B.
I just want to put B in queue if a build of A is already on-going.
Actually I would like to set A as upstream project of B but without setting A as downstream project (plus "Block build when upstream project is building" advanced option), because I don't need to build B each time A is triggered.
As B build step is a python script, I know I can poll
http://myhudson/srs/job/A/api/json?tree=builds[building]
periodically and wait for a true value in top building result, but during this polling, B will be "in progress", and it would be clearer to just have B in queue.
Any suggestions?
Thanks!

One approach would be to use the Exclusion Plugin and set up the two jobs to be mutually exclusive.

Related

How to supress multiple alerts and receive only one notification in Zabbix?

I'm looking for a way to create a Zabbix monitor that will only alert when we have a bunch of other alerts on board.
For example, I have created alerts A, B, C, and so on, and when they appear separately, it is not a big deal, but if together, I would like to know and receive a notification to act accordingly.
Therefore, I wonder if it's feasible to design an alert D that only appears when all the others do.
I have only found a solution using dependent triggers, but it's not suitable in such cases.
If the items are in the same host, just add a higher severity trigger when all items breach their threshold.
If the items are in different hosts, you can add a "parent" host, with the metrics you need. Example: in a 8 nodes cluster, each cluster node will have a Warning severity problem when it's offline, and the parent host will have a High severity problem when more than 4 nodes are offline.
What you are looking for is Trigger Dependency. You can update the current trigger and add dependency with other triggers.
In this case your trigger gets suppressed if you depended triggered is already fired.

Live/Reactive Aggregation in FeathersJS

I am trying to implement a real-time app with FeathersJS and Feathers-Vuex.
A simple example - A todo app where users can add goals, add tasks to goals, and an effort (1-5) to each task. I want to find out the total effort needed for the goal. Anytime the effort of a task changes (CRUD), the effort of the goal gets updated.
So something like -
Goal: G1 (11/15)
Tasks: T1 (4/5), T2 (2/3), T3 (5/5)
How do I calculate this and keep it in sync in FeathersJS + FeathersVuex?
What I've tried so far -
FastJoin to populate this data to each goal record but it wasn't reactive - Not sure how to "listen" to changes to "Tasks"
Hooks and storing stats in the database - It worked for a while but started getting out of hands (when the more services were involved in calculations) and I ended up with too much coupling.
Loading everything in Vuex and calculating it on the front-end - Worked well for prototyping but doesn't work for actual cases where there would be too many records to be able to pre-load everything.
A custom Service - "GoalStats" which calculates the relevant stats by aggregating things from multiple services [Read that it is the recommended way for this]
What I can't seem to figure out is - how to keep things reactive when data is computed/aggregated across services. (Eg. Adding a new task in the above example changes the goal priority)
Still relatively new to FeathersJS and FeathersVuex - so not really sure what I am missing here.
Update
Here is what I have settled on for now - Use hooks from all the dependent services to trigger an empty PATCH request on the Custom Service (if needed). In FeathersVuex, I have added the service and it gets updated.
So, in context of the example above, I am using the before and after hooks of Tasks service to check if Effort value is being changed, or if the task is being added/removed. If so, I dispatch a PATCH request which calls the GET behind the scenes in my custom service, and recalculates the stats which then follows the existing events flow.
Not sure if there is a better way to go about this and/or if there are best practices around managing these cross-service "triggers"

Dealing with exceptions in an event driven world

I'm trying to understand how exceptions are handled in an event driven world using micro-services (using apache kafka). For example, if you take the following order scenario whereby the following actions need to happen before the order can be completed.
1) Authorise the payment with the payment service provider
2) Reserve the item from stock
3.1) Capture the payment with the payment service provider
3.2) Order the item
4) Send a email notification accepting the order with a receipt
At any stage in this scenario, there could be a failure such as:
The item is no longer in stock
The payment information was incorrect
The account the payee is using doesn't have the funds available
External calls such as those to the payment service provider fail, such as downtime
How do you track that each stage has been called for and/or completed?
How do you deal with issues that arise? How would you notify the frontend of the failure?
Some of the things you describe are not errors or exceptions, but alternative flows that you should consider in your distributed architecture.
For example, that an item is out of stock is a perfectly valid alternative flow in your business process. One that possibly requires human intervention. You could move the message to a separate queue and provide some UI where a human operator can deal with the problem, solve it and cause the flow of events to continue.
A similar thing could be said of the payment problems you describe. If an order cannot successfully be settled, a human operator will need to investigate the case and solve it. For that matter, your design must contemplate that alternative flow as part of it, and make it so a human can intervene somehow when the messages end up in a queue that requires a person to review them.
Those cases should be differentiated from errors or exceptions being thrown by the program. Those cases, depending on the circumstance, might in fact require to move the message to a dead letter queue (DLQ) for an engineer to take a look at them.
This is a very broad topic and entire books could written about this.
I believe you could probably benefit from gaining more understanding of concepts like:
Compensating Transactions Pattern
Try/Cancel/Confirm Pattern
Long Running Transactions
Sagas
The idea behind compensating transactions is that every ying has its yang: if you have one transaction that can place an order, then you could undo that with a transaction that cancels that order. This latter transaction is a compensating transaction. So, if you carry out a number of successful transactions and then one of them fails, you can trace back your steps and compensate every successful transaction you did and, as a result, revert their side effects.
I particularly liked a chapter in the book REST from Research to Practice. Its chapter 23 (Towards Distributed Atomic Transactions over RESTful Services) goes deep in explaining the Try/Cancel/Confirm pattern.
In general terms it implies that when you do a group of transactions, their side effects are not effective until a transaction coordinator gets a confirmation that they all were successful. For example, if you make a reservation in Expedia and your flight has two legs with different airlines, then one transaction would reserve a flight with American Airlines and another one would reserve a flight with United Airlines. If your second reservation fails, then you want to compensate the first one. But not only that, you want to avoid that the first reservation is effective until you have been able to confirm both. So, initial transaction makes the reservation but keeps its side effects pending to confirm. And the second reservation would do the same. Once the transaction coordinator knows everything is reserved, it can send a confirmation message to all parties such that they confirm their reservations. If reservations are not confirmed within a sensible time window, they are automatically reversed by the affected system.
The book Enterprise Integration Patterns has some basic ideas on how to implement this kind of event coordination (e.g. see process manager pattern and compare with routing slip pattern which are similar ideas to orchestration vs choreography in the Microservices world).
As you can see, being able to compensate transactions might be complicated depending on how complex is your distributed workflow. The process manager may need to keep track of the state of every step and know when the whole thing needs to be undone. This is pretty much that idea of Sagas in the Microservices world.
The book Microservices Patterns has an entire chapter called Managing Transactions with Sagas that delves in detail on how to implement this type of solution.
A few other aspects I also typically consider are the following:
Idempotency
I believe that a key to a successful implementation of your service transactions in a distributed system consists in making them idempotent. Once you can guarantee a given service is idempotent, then you can safely retry it without worrying about causing additional side effects. However, just retrying a failed transaction won't solve your problems.
Transient vs Persistent Errors
When it comes to retrying a service transaction, you shouldn't just retry because it failed. You must first know why it failed and depending on the error it might make sense to retry or not. Some types of errors are transient, for example, if one transaction fails due to a query timeout, that's probably fine to retry and most likely it will succeed the second time; but if you get a database constraint violation error (e.g. because a DBA added a check constraint to a field), then there is no point in retrying that transaction: no matter how many times you try it will fail.
Embrace Error as an Alternative Flow
As mentioned at the beginning of my answer, not everything is an error. Some things are just alternative flows.
In those cases of interservice communication (computer-to-computer interactions) , when a given step of your workflow fails, you don't necessarily need to undo everything you did in previous steps. You can just embrace error as part of you workflow. Catalog the possible causes of error and make them an alternative flow of events that simply requires human intervention. It is just another step in the full orchestration that requires a person to intervene to make a decision, resolve an inconsistency with the data or just approve which way to go.
For example, maybe when you're processing an order, the payment service fails because you don't have enough funds. So, there is no point in undoing everything else. All we need is to put the order in a state that some problem solver can address it in the system and, once fixed, you can continue with the rest of the workflow.
Transaction and Data Model State are Key
I have discovered that this type of transactional workflows require a good design of the different states your model has to go through. As in the case of Try/Cancel/Confirm pattern, this implies initially applying the side effects without necessarily making the data model available to the users.
For example, when you place an order, maybe you add it to the database in a "Pending" status that will not appear in the UI of the warehouse systems. Once payments have been confirmed the order will then appear in the UI such that a user can finally process its shipments.
The difficulty here is discovering how to design transaction granularity in way that even if one step of your transaction workflow fails, the system remains in a valid state from which you can resume once the cause of the failure is corrected.
Designing for Distributed Transactional Workflows
So, as you can see, designing a distributed system that works in this way is a bit more complicated than individually invoking distributed transactional services. Now every service invocation may fail for a number of reasons and leave your distributed workflow in a inconsistent state. And retrying the transaction may not always solve the problem. And your data needs to be modeled like a state machine, such that side effects are applied but not confirmed until the entire orchestration is successful.
That‘s why the whole thing may need to be designed in a different way than you would typically do in a monolithic client–server application. Your users may now be part of the designed solution when it comes to solving conflicts, and contemplate that transactional orchestrations could potentially take hours or even days to complete depending on how their conflicts are resolved.
As I was originally saying, the topic is way too broad and it would require a more specific question to discuss, perhaps, just one or two of these aspects in detail.
At any rate, I hope this somehow helped you with your investigation.

Queue data publish/duplicate

I am using IBM webSphere MQ 7.5 server as queue manager for my applications.
Already I am receiving data through single queue.
On the other hand there are 3 applications that want to process data.
I have 3 solutions to duplicate/distribute data between them.
Use broker to duplicate 1 to 3 queue - I don't have broker so it is not accessible for me.
Write an application to get from queue and put them in other 3 queues on same machine
Define publish/subscribe definitions to publish input queue to 3 queues on same machine.
I want to know which methods (2 & 3) is preferred and have higher performance and acceptable operational management effort.
Based on the description I would say that going PubSub would achieve the goal; try to thinking in pure PubSub terms rather than thinking about the queues. i.e. You have an application that publishes to a topic with then 3 applications each having their own subscription to get copies of the message.
You then have the flexibility to define durable/nondurable subscriptons for example.
For option # 2, there are (at least) 2 solutions available:
There is an open source application called MMX (Message Multiplexer). It will do exactly what you describe. The only issue is that you will need to manage the application. i.e. If you stop the queue manager then the application will need to be manually restarted.
There is a commercial solution called MQ Message Replication. It is an API Exit that runs within the queue manager and does exactly what you want. Note: There is nothing external to manage as it runs within the queue manager.
I think there is another solution, with MQ only, to define a Namelist which will mirror queue1 to queue2 and queue3
Should be defined like: Source, Destination, QueueManager.
Hope it is useful.
Biruk.

How to store json-patch operations in redis queue and guarantee their consistency?

I have collaborative web application that handles JSON-objects like the following:
var post = {
id: 123,
title: 'Sterling Archer',
comments: [
{text: 'Comment text', tags: ['tag1', 'tag2', 'tag3']},
{text: 'Comment test', tags: ['tag2', 'tag5']}
]
};
My approach is using rfc6902 (JSONPatch) specification with jsonpatch library for patching JSON document. All such documents store in MongoDB database and as you know the last one very slow for frequent writes.
To get more speed and highload application I use redis as queue for a patch operations like the following:
{ "op": "add", "path": "/comments/2", "value": {text: 'Comment test3', tags: ['tag4']}" }
I just store all such patch operations in queue and at midnight run cron script to get all patches and construct full document and update it in MongoDB database.
I don't understand yet what should I do in case corrupted patch like:
{ "op": "add", "path": "/comments/0/tags/5", "value": 'tag4'}
The patch above don't gets applied to document above because tags array has length only 3 (according official docs https://www.rfc-editor.org/rfc/rfc6902#page-5)
The specified index MUST NOT be greater than the number of elements in the array.
So when user is online he don't get any errors because his patch operations get stored in redis queue but next day he get broken document due broken patch that don't got applied in cron script.
So my question if how could I guarantee that all patches that stored in redis queue is correct and don't corrupts primary document?
As with any system that can become inconsistent, you must allow for patches to be applied as quickly as possible if you wish to catch conflicts sooner and decrease the likelihood of running into them. That is likely your main issue if you are not notifying the other clients of any updated data as soon as possible (and are just waiting for the CRON to run to update the shared data that the other clients can access).
As others have asked, it's important to understand how a "bad" patch got into the operation queue in the first place. Here are some guesses from my standpoint:
A user had applied some operations that got lost in translation. How? I don't know, but it would explain the discrepancy.
Operations are not being applied in the correct order. How? I don't know. I have no code to go off of.
Although I have no code to go off of, I can take a shot in the dark and help you analyze the latter point. The first thing we need to analyze is the different scenarios that may come up with updating a "shared" resource. It's important to note that, in any system that must eventually be consistent, we care about the:
Order of the operations.
How we will deal with conflicts.
The latter is really up to you, and you will need a good notification/messaging system to update the "truth" that clients see.
Scenario 1
User A applies operations 1 & 2. The document is updated on the server and then User B is notified of this. User B was going to apply operations 3 & 4, but these operations (in this order) do not conflict with operations 1 & 2. All is well in the world. This is a good situation.
Scenario 2
User A applies operations 1 & 2. User B applies operations 3 & 4.
If you apply the operations atomically per user, you can get the following queues:
[1,2,3,4] [3,4,1,2]
Anywhere along the line, if there is a conflict, you must notify either User A or User B based on "who got there first" (or any other weighting semantics you wish to use). Again, how you deal with conflicts is up to you. If you have not read up on vector clocks, you should do so.
If you don't apply operations atomically per user, you can get the following queues:
[1,2,3,4] [3,4,1,2] [1,3,2,4] [3,1,4,2] [3,1,2,4] [1,3,4,2]
As you can see, forgoing atomic updates per user increases the combinations of updates and will therefore increase the likelihood of a collision happening. I urge you to ensure that operations are being added to the queue atomically per user.
A Recap
Some important things you should remember:
Make sure updates to the queue are atomically applied per user.
Figure out how you will deal with several versions of a shared resource arising from multiple mutations from different clients (again I suggest you read up on vector clocks).
Don't update a shared resource that may be accessed by several clients in real-time as a cron job.
When there is a conflict that cannot be resolved, figure out how you will deal with it.
As a result of point 3, you will need to come up with a notification system so that clients can get updated resources quickly. As a result of point 4, you may choose to include telling clients that something went wrong with their update. Something that has just come to the top of my head is that you're already using Redis, which has pub/sub capabilities.
EDIT:
It seems like Google Docs handles conflict resolutions with transformations. That is, by shifting whole characters/lines over to make way for a hybrid application of all operations: https://drive.googleblog.com/2010/09/whats-different-about-new-google-docs_22.html
As I had said before, it's all up to how you want to handle your own conflicts, which should largely be determined by the application/product itself and its use cases.
IMHO you are introducing unneeded complexity instead of simpler solution. These would be my alternate suggestions instead of your approach of a json patch cron which is very hard to make consistent and atomic.
Use mongodb only : With proper database design and indexing, and proper hdarware allocation/sharding, the write performance of mongodb is really fast. And the kind of operations you are using in jsonpatch are natively supported in mongodb BSON documents and their query language .e.g $push,$set,$inc,$pull etc.
Perhaps you want to not interrupt users activities with a syncronous write to Mongodb , for that the solution is using async queus as mentioned in point#2.
2.Use task queues & mongodb: Instead of storing patches in redis like you do now, you can push the patching task to a task queue, which will asyncronously do the mongodb update , and user will not experience any slow performance. One very good task queue is Celery , which can use Redis as a broker & messaging backend. So, each users updates get a single task, and will get applied to mongodb by the task queue, and there will be no performance hit.