SSE and SQLAlchemy events

In my task, I needed to display real-time charts for many clients.
The data for the charts is added to the database in real time, and I need to fetch it from there.
To notify clients that new data has appeared, I used SSE (server-sent events).
I wrote a stream endpoint, inside of which there is an event handler for SQL inserts.
import queue

from fastapi import Request
from sqlalchemy import event
from sse_starlette.sse import EventSourceResponse  # assuming sse-starlette provides EventSourceResponse

@api_router.get('/stream')
async def message_stream(request: Request):
    async def event_generator():
        data_queue = queue.Queue()

        def after_insert_listener(mapper, connection, target):
            # Push the data of the freshly inserted row into the queue
            data_queue.put(target.data)

        # Data is the mapped model whose inserts should trigger an SSE event
        event.listen(Data, 'after_insert', after_insert_listener)

        while True:
            # If client closes connection, stop sending events
            if await request.is_disconnected():
                break
            # Check for new messages and return them to the client if any
            if not data_queue.empty():
                yield {
                    "data": data_queue.get()
                }

    return EventSourceResponse(event_generator())
How correct is this solution?

Related

How do I serialize transactions with multiple database queries using Sequelize?

My server keeps track of game instances. If there are no ongoing games when a user hits a certain endpoint, the server creates a new one. If the endpoint is hit twice at the same time, I want to make sure only one new game is created. I'm attempting to do this via Sequelize's transactions:
const t = await sequelize.transaction({
  isolationLevel: Sequelize
    .Transaction
    .ISOLATION_LEVELS
    .SERIALIZABLE,
});
let game = await Game.findOne({
  status: {[Op.ne]: "COMPLETED"},
  transaction: t,
});
if (game) {
  // ...
} else {
  game = await Game.create({}, {
    transaction: t,
  });
  // ...
}
await t.commit();
Unfortunately, when this endpoint is hit twice at the same time, I get the following error: SequelizeDatabaseError: Deadlock found when trying to get lock; try restarting transaction.
I looked at possible solutions here and here, and I understand why my code throws the error, but I don't understand how to accomplish what I'm trying to do (or whether transactions are the correct tool to accomplish it). Any direction would be appreciated!

Determining whether an API is RPC or REST

I recently designed a REST API using Flask for a sample project. The front end was based on React.js. But I got feedback from a colleague that the API is not a REST API but RPC.
The API basically accepts 3 parameters, 2 numbers and an operation ('add', 'sub', 'mul', 'div'), on the endpoint http://127.0.0.1:5000/calculator
The input JSON will look like:
{"value1":"7.1","value2":"8","operator":"mul"}
from flask import Flask, jsonify, request, abort
from flask_cors import CORS

APP = Flask(__name__, static_url_path='')
CORS(APP)  # For cross origin resource sharing
APP.config['CORS_HEADERS'] = 'Content-Type'

@APP.route('/calculator', methods=['POST'])
def calculator_operation():
    if not request.json:
        abort(400)
    try:
        val1 = float(request.json['value1'])
        val2 = float(request.json['value2'])
        operator = request.json['operator']
        if operator == 'add':
            result = val1 + val2
        elif operator == 'mul':
            result = val1 * val2
        elif operator == 'sub':
            result = val1 - val2
        elif operator == 'div' and val2 == 0:
            result = 'Cant divide by 0'
        elif operator == 'div':
            result = round((val1 / val2), 2)
        return (jsonify({'result': result}), 200)
    except KeyError:
        abort(400)

if __name__ == '__main__':
    APP.run(debug=True)
The code works fine. I would like to know if this is REST or RPC based on the end points and the operation being performed.
EDIT:
Ajax Call
$.ajax({
    type: "POST",
    url: "http://127.0.0.1:5000/calculator",
    data: JSON.stringify({
        value1: arg1,
        value2: arg2,
        operator: this.state.operation
    }),
    contentType: "application/json",
    dataType: "json",
    success: (data) => {
        this.setState({ result: data.result, argumentStr: data.result });
    },
    error: (err) => {
        console.log(err);
    }
});
I would like to know if this is REST or RPC based on the end points and the operation being performed.
How does the client discover what the endpoint is, and what the input json looks like?
On the web, there would be a standard media type that describes forms; the representation of the form would include keys and values, a target URI, and an HTTP method to use. The processing rules would describe how to take the details of the form, and the values provided by the consumer, and from them construct an HTTP request.
That's REST: doing what we do on the web.
Another REST approach would be to define a link relation, perhaps "http://example.org/calculator", and a media type application/prs.calculator+json, and then document that in your context the "http://example.org/calculator" link relation indicates that the target URI responds to POST messages with payload application/prs.calculator+json. This is essentially what Atom Syndication and AtomPub do.
That's also REST.
Fielding made an interesting comment about an API he had designed:
I should also note that the above is not yet fully RESTful, at least how I use the term. All I have done is described the service interfaces, which is no more than any RPC. In order to make it RESTful, I would need to add hypertext to introduce and define the service, describe how to perform the mapping using forms and/or link templates, and provide code to combine the visualizations in useful ways.
That said, if you are performing GET-with-a-payload, a semantically safe request with a body, then you are probably trapped in RPC thinking. Notice that on the web, parameterized reads are done by communicating to the client how to modify the target-uri (for instance, by appending a query string with data encoded according to standardized processing rules).
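To make that concrete, here is a rough sketch (not the original poster's code) of the same calculator exposed as a parameterized read, with the operands and operator moved into the query string so a request looks like GET /calculator?value1=7.1&value2=8&operator=mul; the route and parameter names simply mirror the question:
from flask import Flask, jsonify, request, abort

APP = Flask(__name__)

@APP.route('/calculator', methods=['GET'])
def calculate():
    # Read the inputs from the target URI's query string instead of a JSON body
    try:
        val1 = float(request.args['value1'])
        val2 = float(request.args['value2'])
        operator = request.args['operator']
    except (KeyError, ValueError):
        abort(400)
    ops = {
        'add': lambda a, b: a + b,
        'sub': lambda a, b: a - b,
        'mul': lambda a, b: a * b,
        'div': lambda a, b: 'Cant divide by 0' if b == 0 else round(a / b, 2),
    }
    if operator not in ops:
        abort(400)
    return jsonify({'result': ops[operator](val1, val2)})
Whether that alone makes the service RESTful is exactly the point of the answers in this thread; the sketch only illustrates the "modify the target URI" style of parameterized read.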
REST stands for REpresentational State Transfer. Your operation is stateless, therefore there is no state to transfer. Your operation does, however, accept arguments and return a result in the manner of a procedure or function, and it is remote, so Remote Procedure Call would be a good description of what's going on. You are, after all, providing a language-independent way for clients to call your calculator_operation procedure.
What's conspicuously missing from your model is server-side data. In general, a REST API provides a way to interact with server-side objects: to query, create, replace, update, or delete. There's no data kept on your server side: it's just an oracle which answers questions. You have the "query" aspect and nothing else.

Apache NiFi - When using SplitText on large files, how can I make PutFile write out immediately?

I am reading in text files with 50k rows of data where each row represents a complete record.
Our NiFi flow uses SplitText to handle the file in batches of 1000 rows. (This was set up before my time, due to memory issues I'm told.)
Is it possible to have PutFile execute immediately? I want each record to be written out by PutFile as soon as it is done, not to sit in a queue waiting for all 50k+ rows of data to be processed. That seems rather wasteful if the data is being split up anyway.
I was reading the documentation but cannot find whether this is by design and not configurable.
I'd appreciate any documentation or guidance that can help me answer this and configure my flow.
TL;DR A workaround is to use multiple SplitText processors, the first one splitting into, for example, 10k rows, and the second splitting into 1000 rows. Then the first 10k rows will be split into 10 flow files and sent downstream while the second 10k rows are being processed by the second SplitText.
EDIT: Adding another workaround, a Groovy script to be used in InvokeScriptedProcessor:
// The usual NiFi API imports for an InvokeScriptedProcessor script (assumed here)
import org.apache.nifi.components.PropertyDescriptor
import org.apache.nifi.components.ValidationContext
import org.apache.nifi.components.ValidationResult
import org.apache.nifi.logging.ComponentLog
import org.apache.nifi.processor.ProcessContext
import org.apache.nifi.processor.ProcessSessionFactory
import org.apache.nifi.processor.Processor
import org.apache.nifi.processor.ProcessorInitializationContext
import org.apache.nifi.processor.Relationship
import org.apache.nifi.processor.exception.ProcessException
import org.apache.nifi.processor.io.OutputStreamCallback

class GroovyProcessor implements Processor {

    def REL_SUCCESS = new Relationship.Builder().name("success").description('FlowFiles that were successfully processed are routed here').build()
    def REL_FAILURE = new Relationship.Builder().name("failure").description('FlowFiles that were not successfully processed are routed here').build()
    def REL_ORIGINAL = new Relationship.Builder().name("original").description('After processing, the original incoming FlowFiles are routed here').build()

    def ComponentLog log

    void initialize(ProcessorInitializationContext context) { log = context.logger }

    Set<Relationship> getRelationships() { return [REL_FAILURE, REL_SUCCESS, REL_ORIGINAL] as Set }

    Collection<ValidationResult> validate(ValidationContext context) { null }

    PropertyDescriptor getPropertyDescriptor(String name) { null }

    void onPropertyModified(PropertyDescriptor descriptor, String oldValue, String newValue) { }

    List<PropertyDescriptor> getPropertyDescriptors() { null }

    String getIdentifier() { null }

    void onTrigger(ProcessContext context, ProcessSessionFactory sessionFactory) throws ProcessException {
        // One session owns the incoming flow file, a second session emits (and commits) the splits
        def session1 = sessionFactory.createSession()
        def session2 = sessionFactory.createSession()
        try {
            def inFlowFile = session1.get()
            if (!inFlowFile) return
            def inputStream = session1.read(inFlowFile)
            inputStream.eachLine { line ->
                def outFlowFile = session2.create()
                outFlowFile = session2.write(outFlowFile, { outputStream ->
                    outputStream.write(line.bytes)
                } as OutputStreamCallback)
                // Transfer and commit each line immediately so it heads downstream right away
                session2.transfer(outFlowFile, REL_SUCCESS)
                session2.commit()
            }
            inputStream.close()
            session1.transfer(inFlowFile, REL_ORIGINAL)
            session1.commit()
        } catch (final Throwable t) {
            log.error('{} failed to process due to {}; rolling back session', [this, t] as Object[])
            session2.rollback(true)
            session1.rollback(true)
            throw t
        }
    }
}

processor = new GroovyProcessor()
For completeness:
The Split processors were designed to support the Split/Merge pattern, and in order to merge the split flow files back together later, each of them needs the same "parent ID" as well as the total count.
If you send flow files out before you've split everything up, you won't know the total count and won't be able to merge them back later. Also, if something goes wrong with the split processing, you may want to "roll back" the operation instead of having some flow files already downstream and the rest of them sent to failure.
In order to send out some flow files before all processing is finished, you have to commit the process session. This prevents you from doing the things above, and it creates a break in the provenance for the incoming flow file, as you have to commit/transfer that file in the session that originally takes it in. All following commits will need new flow files created, which breaks the provenance/lineage chain.
Although there is an open Jira for this (NIFI-2878), there has been some dissent on the mailing lists and pull requests about adding this feature to processors that accept input (i.e. non-source processors). NiFi's framework is fairly transactional, and this kind of feature flies in the face of that.

How to send and receive large numpy arrays (several GBs) using flask

I am creating a micro-service to be used locally. From some input I am generating one large matrix each time. Right now I am using JSON to transfer the data, but it is really slow and has become the bottleneck of my application.
Here is my client side:
import json
import requests

headers = {'Content-Type': 'application/json'}
data = {'model': 'model_4',
        'input': "this is my input."}
r = requests.post("http://10.0.1.6:3000/api/getFeatureMatrix",
                  headers=headers, data=json.dumps(data))
answer = json.loads(r.text)
My server is something like:
app = Flask(__name__, static_url_path='', static_folder='public')

@app.route('/api/getFeatureMatrix', methods=['POST'])
def get_feature_matrix():
    arguments = request.get_json()
    # processing ... generating matrix
    return jsonify(matrix=matrix.tolist())
How can I send large matrices?
In the end I ended up using
np.save(matrix_path, mat)
return send_file(matrix_path+'.npy')
On the client side I save the matrix before loading it.
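For completeness, here is a rough sketch of what that client side might look like (the URL and payload are taken from the question; the local file name is arbitrary):
import json

import numpy as np
import requests

headers = {'Content-Type': 'application/json'}
data = {'model': 'model_4', 'input': "this is my input."}
r = requests.post("http://10.0.1.6:3000/api/getFeatureMatrix",
                  headers=headers, data=json.dumps(data))

# Write the raw .npy bytes returned by send_file to disk, then load them
with open('matrix.npy', 'wb') as f:
    f.write(r.content)
matrix = np.load('matrix.npy')
If you would rather skip the temporary file, np.load(io.BytesIO(r.content)) works as well.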
I suppose that the problem is that the matrix takes time to generate. It's a CPU-bound application.
One solution would be to handle the request asynchronously. Meaning that:
The server receives the request and returns a 202 ACCEPTED with a link to where the client can check the progress of the creation of the matrix
The client checks the returned URL and either gets:
a 200 OK response if the matrix is not yet created
a 201 CREATED response if the matrix is finally created, with a link to the resource
However, Flask handles one request at a time. So you'll need to use multithreading or multiprocessing or greenthreads.
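As a rough illustration only (not a production setup), that pattern could look something like this in Flask with a background thread; the job table, endpoint names and file paths below are made up, and the finished matrix is returned directly instead of via a separate 201-plus-link step:
import threading
import uuid

import numpy as np
from flask import Flask, jsonify, request, send_file, url_for

app = Flask(__name__)
jobs = {}  # job_id -> path of the finished .npy file, or None while still running

def build_matrix(job_id, payload):
    matrix = np.random.rand(1000, 1000)  # stand-in for the real CPU-bound generation
    path = '/tmp/{}.npy'.format(job_id)
    np.save(path, matrix)
    jobs[job_id] = path

@app.route('/api/getFeatureMatrix', methods=['POST'])
def get_feature_matrix():
    job_id = str(uuid.uuid4())
    jobs[job_id] = None
    threading.Thread(target=build_matrix, args=(job_id, request.get_json())).start()
    # 202 ACCEPTED plus a URL the client can poll
    return jsonify(status_url=url_for('job_status', job_id=job_id)), 202

@app.route('/api/jobs/<job_id>')
def job_status(job_id):
    path = jobs.get(job_id)
    if path is None:
        return jsonify(status='in progress'), 200
    return send_file(path)
The same idea works with multiprocessing or a proper task queue; the point is that the HTTP request itself returns immediately.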
As for moving the data itself, you could send the raw bytes instead of JSON. On the client side you could do something like:
with open('binary.file', 'rb') as f:
    file = f.read()

response = requests.post('/endpoint', data=file)
and on the Server side:
import numpy as np
...
@app.route('/endpoint', methods=['POST'])
def endpoint():
    filestr = request.data
    # Rebuild the array from the raw bytes; shape and dtype are not transmitted,
    # so both sides have to agree on them (np.frombuffer avoids the fromstring deprecation)
    file = np.fromstring(filestr)
    return 'OK'

How to replace sagas at run time?

I have a React Native app that may connect to different API endpoints. Some users may need to change the API endpoint at run time, without restarting the app. All the API requests are bound to sagas, and the root saga looks like this:
export default function* rootSaga() {
  yield [
    takeLatest([
      ONE_REQUEST,
      ANOTHER_REQUEST,
      // a bunch of sagas that are responsible for API querying
    ], api), // <- here, api is global.
  ];
}
so it can run along with the newly instantiated Redux store:
import rootSaga from './sagas';
const sagaMiddleware = createSagaMiddleware();
const store = createStore(rootReducer, applyMiddleware(sagaMiddleware));
// other stuff, and finally
sagaMiddleware.run(rootSaga).done.catch(console.error);
The problem is, once executed, the Store, let alone sagas, can never be updated.
I tried to pass api to root saga as the first argument:
export default function* rootSaga(baseUrl = DEFAULT_API_URL) {
  const api = create({
    baseUrl,
    // other stuff that is required by apisauce
  });
  yield [
    takeLatest([
      ONE_REQUEST,
      ANOTHER_REQUEST,
      // a bunch of sagas that are responsible for API querying
    ], api), // <- here, api is instantiated per every yield* of rootSaga.
  ];
}
I also tried to refer to the generator itself from within a function executed for a certain action type:
yield [
  takeLatest([
    ONE_REQUEST,
    ANOTHER_REQUEST,
    // a bunch of sagas that are responsible for API querying
  ], api), // <- here, api is instantiated per every yield* of rootSaga.
  takeEvery([
    REPLACE_API // <- the action I would dispatch to replace API endpoint
  ], ({endpoint}) => {
    yield cancel(rootSaga);
    yield* rootSaga(endpoint); // <- the new API endpoint
  }),
];
But it didn't work. I also tried a bunch of other tactics, but none of them really worked either. I looked through the docs for something similar to Redux's replaceReducer, but there is nothing like that for redux-saga, which makes me feel that it should be achievable just by yielding the proper sequence from the root saga generator.
So, is there a general approach to this problem? Is it possible to re-instantiate the root saga at run time?
It seems that you can add the endpoint URL to the state tree and manage updating the URL with a typical redux-saga flow. Then, when you dispatch your REQUEST actions, just read the current endpoint URL from the state tree and attach it as a payload to the REQUEST action. Finally, in your api saga, use that URL payload.