Suppose I receive a big CSV file with lots of data in it, and my LoopBack server must parse all this data after the file is loaded and run some processes on it (e.g. create user accounts and do some other registrations related to each account, or just create a database entry for each row in the file). The file can have anywhere from 10,000 to 3,000,000 entries (I'm using MySQL, by the way; maybe there is a better option for that too), so it takes a long time to process. Is there a "neat" way to handle that?
Right now, after I get the file, I return the response to the user in my remote method with callback(null, {message: 'got file, server still working'}); and continue processing in the background (on the same remote method code path; I just don't call the callback again when I'm done, because I already did). Then I run a 500 ms interval timer in the front-end that requests the processing status from a different endpoint (I save the progress percentage in a database field for that endpoint to read). Is this the way to do this, or is there a better option?
I already run my MySQL queries in batches of 10,000 per commit and I disable foreign key checking too (I'm using the MySQL connector's query execution directly). Thanks in advance :)
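For reference, the batched-insert-with-progress part of the approach described above can be sketched like this in Node.js. The insertBatch and saveProgress callbacks are hypothetical stand-ins for the real MySQL connector calls; the chunking and progress-reporting pattern is the point:

```javascript
// Split rows into fixed-size batches and report progress after each batch.
// `insertBatch` and `saveProgress` are hypothetical stand-ins for the real
// MySQL calls (one multi-row INSERT + COMMIT per batch, and an UPDATE of
// the progress field that the status endpoint reads).
async function processRows(rows, insertBatch, saveProgress, batchSize = 10000) {
  for (let i = 0; i < rows.length; i += batchSize) {
    const batch = rows.slice(i, i + batchSize);
    await insertBatch(batch);
    const percent = Math.round(((i + batch.length) / rows.length) * 100);
    await saveProgress(percent); // the 500 ms front-end poll reads this value
  }
}
```

The front-end's interval timer then only ever reads the stored percentage; it never touches the import itself.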
So, I have to read an Excel file in which each row contains some data that I want to write to my database. I pass the whole file to Laravel, which reads the file and formats it into an array, and then I make a new insertion (or update) in my database for each row.
The thing is, the input Excel file can contain thousands of rows, and it's taking a while to complete, giving a timeout error in some cases.
When I try this locally I use the set_time_limit(0); function so the timeout doesn't occur, and it works pretty well. But on a remote server this function is disabled for security reasons, and my code crashes because of a timeout.
Can somebody help me solve this problem? Maybe another idea on how to better approach it?
A nice way to handle tasks that take a long time is by making use of so-called jobs.
You can make a job called ImportExcel and dispatch it when someone sends you a file.
Take a good look at the docs; they have some great examples of how to do this.
You can take care of this using the following steps:
1. Take the CSV file and store it temporarily in storage:
You can store the large CSV when the user uploads it. If it's not something uploaded from the front-end, just make sure you have it saved so it can be processed in the next step.
2. Then dispatch a job which can be queued:
You can create a job which handles this asynchronously. You can use Supervisor to manage queue workers, timeouts, etc.
3. Use a package like thephpleague/csv:
Using this package (or a similar one), you can chunk the records or read them one at a time. It is really, really helpful for keeping your memory usage under the limit. Plus, it has different methods available for reading the data from files.
4. Once the file is processed, you can delete it from temporary storage:
Just some teardown/cleanup activity.
I have an Android frontend.
The Android client makes a request to my NodeJS backend server and waits for a reply.
The NodeJS server reads a value from a MySQL database record (without sending it back to the client) and waits for that value to change (another Android client changes it with a different request within 20 seconds); when that happens, the NodeJS server replies to the first client with the new value.
Now, my approach was to create a MySQL trigger and when there is an update in that table it notifies the NodeJS server, but I don't know how to do it.
I thought of two easier approaches using busy waiting, just to give you the idea:
the client sends a request every 100 ms and the server replies with the SELECT of that value; when the client gets a different reply, it means the value changed;
the client sends a request and the server makes a SELECT query every 100 ms until it gets a different value, then replies to the client with that value.
Both are brute-force approaches, and I would like to avoid them for obvious reasons. Any ideas?
Thank you.
Welcome to StackOverflow. Your question is very broad and I don't think I can give you a very detailed answer here. However, I think I can give you some hints and ideas that may help you along the road.
MySQL has no internal way of running external commands as a trigger action. To my knowledge, there exists a workaround in the form of an external plugin (UDF) that allows MySQL to do what you want. See Invoking a PHP script from a MySQL trigger and https://patternbuffer.wordpress.com/2012/09/14/triggering-shell-script-from-mysql/
However, I think going this route is a sign of using the wrong architecture or wrong design patterns for what you want to achieve.
The first idea that pops into my mind is this: would it not be possible to introduce some sort of messaging from the second nodejs request (the one that changes the DB) to the first one (the one that needs an update when the DB value changes)? That way, the first nodejs process only needs to query the DB upon real changes, when it receives a message.
Another question would be whether you actually need to use MySQL, or whether some other datastore might be better suited. Redis comes to mind, since with Redis you could implement the messaging to nodejs at the same time...
In general, polling is not always the wrong choice, especially in high-load environments where you expect each poll to collect some data. Polling makes it impossible to overload the processing capacity on the data-retrieving side, since that process controls the maximum throughput. With pushing, you give that control to the pushing side, and if there are many such pushing sides, control is hard to achieve.
If I was you I would look into redis and learn how elegantly its publish/subscribe mechanism can be used as messaging system in your context. See https://redis.io/topics/pubsub
I'm currently working to create a huge MySQL database by parsing XML files released on an FTP server.
On a single computer it takes ages, because of the huge number of SQL INSERT INTO statements to run.
Thus, I modified my code to build it on AWS by creating a cluster, launching a database, building everything, and downloading back the dump.
However, I have a question. Is there a "queue" for the SQL requests that are sent? I mean, if all of my nodes send requests to the database at the same time, what's going to happen?
Thanks
On MySQL you can use SHOW FULL PROCESSLIST to see the open connections and what query they are running at the moment.
There is no queue of requests, but some requests wait for others to complete before starting, because they attempt to use rows or tables that are locked by the queries currently running.
Only one request is executed at a time for each connection.
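That last point can be illustrated without MySQL at all: a single connection behaves like a serial task queue. This is a model of the driver's per-connection queuing, not the driver itself; with two Connection instances (i.e. a pool of two) the tasks could interleave, on one they cannot:

```javascript
// Model of per-connection behavior: each "connection" runs its queued
// queries strictly one after another, even if they are submitted at once.
class Connection {
  constructor() {
    this.tail = Promise.resolve(); // the most recently queued task
  }
  // Queue an async task; it starts only after all earlier tasks finish.
  run(task) {
    const result = this.tail.then(task);
    this.tail = result.catch(() => {}); // keep the queue alive after errors
    return result;
  }
}
```

This is why a cluster of nodes hammering one database through separate connections can still contend: the serialization per connection disappears, and row/table locks become the only ordering mechanism.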
I'm using socket.io to send data from my database to clients. But my code sends data to the client every second, even when the data is the same. How can I send data only when a field in the DB has changed, rather than every second?
Here is my code: http://pastebin.com/kiTNHgnu
With MySQL there is no easy/simple way to get notified of changes. A couple of options include:
If you have access to the server the database is running on, you could stream MySQL's binlog through some sort of parser and then check for events that modify the desired table/column. There are already such binlog parsers on npm if you want to go this route.
Use a MySQL trigger to call out to a UDF (user-defined function). This is a little tricky because there aren't a whole lot of these for your particular need. There are some that could be useful however, such as mysql2redis which pushes to a Redis queue which would work if you already have Redis installed somewhere. There is a STOMP UDF for various queue implementations that support that wire format. There are also other UDFs such as log_error which writes to a file and sys_exec which executes arbitrary commands on the server (obviously dangerous so it should be an absolute last resort). If none of these work for you, you may have to write your own UDF which does take quite some time (speaking from experience) if you're not already familiar with the UDF C interface.
I should also note that UDFs could introduce delays in triggered queries depending on how long the UDF takes to execute.
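If neither binlog parsing nor a UDF is workable, a simpler compromise is to keep polling on the server but only emit over the socket when the value actually differs from the last one seen. A minimal sketch of that diffing poller, where fetchValue stands in for the real MySQL SELECT and emit for socket.emit(...):

```javascript
// Poll a value source on an interval, but invoke `emit` only when the value
// differs from the previously seen one. `fetchValue` stands in for the real
// MySQL SELECT and `emit` for socket.emit(...).
function watchForChanges(fetchValue, emit, intervalMs = 1000) {
  let last;
  let first = true;
  const timer = setInterval(async () => {
    const value = await fetchValue();
    if (first || value !== last) {
      first = false;
      last = value;
      emit(value); // fires only on an actual change
    }
  }, intervalMs);
  return () => clearInterval(timer); // call to stop watching
}
```

The database load is the same as before, but clients stop receiving duplicate payloads; for non-scalar rows, compare a serialized form or a version/updated_at column instead of the raw value.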
I am currently importing a huge CSV file from my iPhone to a Rails server. In this case, the server parses the data and then starts inserting rows into the database. The CSV file is fairly large, so the operation takes a long time to finish.
Since I am doing this asynchronously, my iPhone is then able to go to other views and do other stuff.
However, when it issues another query against another table, the request HANGS because the first operation is still inserting the CSV's data into the database.
Is there a way to resolve this type of issue?
As long as the phone doesn't care when the database insert is complete, you might want to try storing the CSV file in a tmp directory on your server and then having a script write from that file to the database. Or simply store it in memory. That way, once the phone has posted the CSV file, it can move on to other things while the script handles the database inserts asynchronously. And yes, @Barmar is right about using the InnoDB engine rather than MyISAM (which may be the default in some configurations).
Or, you might want to consider enabling "low-priority updates" which will delay write calls until all pending read calls have finished. See this article about MySQL table locking. (I'm not sure what exactly you say is hanging: the update, or reads while performing the update…)
Regardless, if you are posting the data asynchronously from your phone (i.e., not from the UI thread), it shouldn't be an issue as long as you don't try to use more than the maximum number of concurrent HTTP connections.