What's faster on MySQL? Column-per-column handling or via index?

I'm currently working on a cloud project. It's hosted on Amazon AWS and the data is stored in RDS (MySQL). I have many devices making many small requests; the devices ask the server for new commands to execute. The devices have some parameters like "power"=1 or 0, etc., and the commands tell the devices what to do. Now there are two scenarios:
Every command is a row in a table called "commands"; the devices poll and the server searches for commands with device=ID, "the classic style". It returns the row and deletes it (2 queries).
There is a table called "parameters" where all the "power", ... status values are stored; every row has a timestamp and the device ID. So on every request the server says: OK, the device's timestamp is xxx, so which parameters were updated after xxx?
The description is a bit complicated, sorry for that. The point is: in the first idea there are not as many columns as in the second. But in the second the server has to check every column for WHERE device=ID AND timestampx > 'device_time_stamp'. Every device asks every 5 seconds and there will be a lot of devices, so it's a question of performance.
Thanks folks

On the limited information available, you might have something like this:
device_command
(device_command_id -- PK establishing a sequence for the commands
,device_id
,command_id
,completed -- either a simple 1/0 flag, or a timestamp
)
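Rendered as actual MySQL DDL plus the two queries the question describes, that might look something like the sketch below; the names and types are only assumptions on my part:
CREATE TABLE device_command (
  device_command_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,  -- establishes the command sequence
  device_id         INT UNSIGNED NOT NULL,
  command_id        INT UNSIGNED NOT NULL,
  completed         DATETIME NULL,                                     -- NULL = not yet executed
  KEY idx_device_pending (device_id, completed)
);

-- a device polls for its pending commands
SELECT device_command_id, command_id
FROM device_command
WHERE device_id = 42 AND completed IS NULL
ORDER BY device_command_id;

-- once acknowledged, mark the command done (or DELETE it, as in the question)
UPDATE device_command SET completed = NOW() WHERE device_command_id = 1001;
The composite index on (device_id, completed) lets the polling query read only that device's open commands instead of scanning the whole table.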
I'm not sure I understand how the 'parameters' fit into all this so I'll leave it at that for now.

Related

Sync data from multiple local mysql instances to one cloud database

I am looking for a solution to sync data from multiple small instances to one big cloud instance.
I have many devices gathering data logs, and every device has its own database, so I need a solution to sync data from them to one instance. The delay is not critical, but I want to sync the data with a max delay of 5-10 min.
Is there any ready solution for it?
Assuming all the data is independent, INSERT all the data into a single table. That table would, of course, have a device_id column to distinguish where the numbers are coming from.
What is the total number of rows per second you need to handle? If it is less than 1000/second, there should be no problem inserting the rows into the same table as they arrive.
Are you using HTTP? Or something else to do the INSERTs? PHP? Java?
With this, you will rarely see more than a 1 second delay between the reading being taken and the table having the value.
I recommend
PRIMARY KEY(device_id, datetime)
And the use of Summary tables rather than slogging through that big Fact table to do graphs and reports.
Provide more details if you would like further advice.
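For illustration, a rough sketch of that layout; the table and column names here (readings, value, etc.) are placeholders, not something from the question:
CREATE TABLE readings (
  device_id INT UNSIGNED NOT NULL,
  datetime  DATETIME NOT NULL,
  value     DECIMAL(10,3) NOT NULL,
  PRIMARY KEY (device_id, datetime)        -- one row per device per reading time
);

-- hypothetical hourly summary table, refreshed periodically for graphs and reports
CREATE TABLE readings_hourly (
  device_id  INT UNSIGNED NOT NULL,
  hr         DATETIME NOT NULL,
  row_count  INT UNSIGNED NOT NULL,
  avg_value  DECIMAL(10,3) NOT NULL,
  PRIMARY KEY (device_id, hr)
);

INSERT INTO readings_hourly
SELECT device_id,
       DATE_FORMAT(datetime, '%Y-%m-%d %H:00:00') AS hr,
       COUNT(*),
       AVG(value)
FROM readings
WHERE datetime >= NOW() - INTERVAL 1 HOUR
GROUP BY device_id, hr;
Reports then read the small readings_hourly table instead of slogging through the big fact table.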

update specific column of a row in mysql after every 5 minutes using node.js

I am working on a project with a node.js (not express) server and a MySQL database. When a user clicks a button on the page, it uploads 2 values (say SpecificName, Yes/No). These values get inserted into the MySQL database through the node server. Later, MySQL runs a check for the specificName column (if it finds none, it creates a column with that name) and updates the second value in it.
Now I would like to keep every update of the second value that the user makes through the website (i.e. yes) in the MySQL database for 5 minutes, after which it automatically updates that specific location with another value (say cancel). I've managed to solve everything except this 5-minute problem. Also, I'm keeping 15-20 so-called specificName columns in which the value (say yes/no) is being updated, and at the same time there are more than 1000 rows being worked on simultaneously, so there are lots of 5-minute timers running for the values. Is there a way to store a value temporarily in MySQL, after which it is destroyed automatically?
I came across :
node-crons (too complex, don't even know if it's the right choice)
mysql events (I'm not sure how to use it with node)
timestamp (can't create more than one timestamp (guess I need one for each column))
datetime (haven't tested it yet) and other things like
(DELETE FROM table WHERE timestamp < DATE_SUB(NOW(), INTERVAL 5 MINUTE)).
Now I have no idea what to use or how to resolve this dilemma.
Any help would be appreciated.
Per my conversation with Sammy on kik, I'm pretty sure you don't want to do this. This doesn't sound like a use case that fits MySQL. I also worry that your MySQL knowledge is super limited, in which case you should take the time to do more research on MySQL. Without a better understanding of the larger goal(s) your application is trying to accomplish, I can't suggest better alternatives. If you can think of a way to explain the application behavior without compromising the product idea, that would go a long way toward helping us solve your problem.
General things I want to make clear before giving you potential answers:
You should not be altering columns from your application. This is one of my issues with the Node/Mongo world. Relational databases don't like frequently changing table definitions. It's a quick way to a painful day. Doing so is fine in non-relational systems like Mongo or Cassandra, but traditional relational databases do not like this. The application should only be inserting, updating, and deleting rows. Ye hath been warned.
I'm not sure you want to put data into MySQL that has a short expiration date. You probably want some sort of caching solution like memcache or redis. Now, you can make MySQL behave like a cache, but this is not its intended use. There are better solutions. If you insist on using MySQL for this, I recommend investigating the MEMORY storage engine for faster reads/writes at the cost of losing data if the system suddenly shuts down.
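If you do go that route anyway, declaring a MEMORY table is a one-line change; a minimal sketch with made-up names (pending_answers, answer, etc.):
CREATE TABLE pending_answers (
  user_id       INT UNSIGNED NOT NULL,
  specific_name VARCHAR(64) NOT NULL,
  answer        ENUM('yes','no') NOT NULL,
  created_at    DATETIME NOT NULL,
  PRIMARY KEY (user_id, specific_name)
) ENGINE=MEMORY;    -- kept in RAM, emptied on server restart
Everything else works like a normal table; only the storage engine changes.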
Here are some potential solutions:
MySQL Events - Have a timestamp column and an event scheduled to run... say every minute or so. If the event finds that a row has lived more than 5 minutes, delete it (see the sketch after this list).
NodeJS setTimeout - From the application, after inserting the record(s), set a timeout for 5 minutes to go and delete said records. You'll probably want to ensure you have some sort of id or timestamp column for supahfast reference of the values.
Those are the two best solutions that come to mind for me. Again, if you're comfortable revealing how your application behaves that requires an unusual solution like this, we can likely help you arrive at a better solution.
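Here is a minimal sketch of the MySQL Events option from the list above; the answers table and the created_at column are assumptions:
-- the event scheduler is off by default
SET GLOBAL event_scheduler = ON;

CREATE EVENT purge_expired_answers
ON SCHEDULE EVERY 1 MINUTE
DO
  DELETE FROM answers
  WHERE created_at < NOW() - INTERVAL 5 MINUTE;
The event runs inside the server, so nothing extra is needed on the Node side.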
OK, so I guess I figured it out myself. I'm posting this answer for all those who still deal with this problem. I used a DATETIME stamp in MySQL that I created for each specificName column, so with every specificName column there exists another specificName_TIME column that stores the time at which the value (yes/no) is updated. The reason I didn't use TIMESTAMP is that it's not possible to create more than one TIMESTAMP column in MySQL versions lower than 5.6. Now, I updated the current time by adding 5 minutes before storing it in the database. Then I ran 2 chained functions. The first one checks whether the datetime in the database is smaller than the current time (SELECT specificName FROM TABLE WHERE specificName_TIME < NOW()). If it turns out to be true it shows me the value, else it returns null. Then I ran the second function to update the value: if it's true, to continue the whole process again, and if not, to continue anyway, updating the last value with null.
Hope this helps.
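A rough sketch of the flow described above; the table settings, the user_id column, and the literal values are placeholders:
-- on a click from the website: store the value plus an expiry time 5 minutes ahead
UPDATE settings
SET specificName = 'yes',
    specificName_TIME = NOW() + INTERVAL 5 MINUTE
WHERE user_id = 7;

-- first chained function: the value is returned only once the expiry has passed
SELECT specificName
FROM settings
WHERE user_id = 7
  AND specificName_TIME < NOW();

-- second chained function: reset the expired value
UPDATE settings
SET specificName = 'cancel'
WHERE user_id = 7
  AND specificName_TIME < NOW();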

Mysql update table at the same time

Let's say in MySQL I want to update a column in one of the tables. I need to SELECT the record, change the value, and after that UPDATE it back to the database. In some cases I can't do these 2 operations in one SQL query or nest them in a subquery (due to MySQL limitations), so I have to load the value into a program (let's say Java), change it, and then put it back into the database.
For example, program A gets a column's value and wants to increase it by one and then put it back. At the same time, program B wants to do the same thing. Before program A puts back the increased value, program B has already read the wrong value (program B is supposed to get the value after program A's increase, but it runs at the same time as program A, so it retrieves the same value as A).
Now my question is , what are the good ways to handle this kind of problem?
Another question: I believe MySQL is not a single-threaded system, but let's say two identical queries (updating the same table, same column, and same record) come in at the same time. How does MySQL handle this kind of situation? Which one will MySQL schedule first and which one later?
Moreover, could anyone explain a bit how MySQL's multithreading support works? One connection, one thread? So all the statements created under that connection are scheduled in the same queue?
If you're using InnoDB, you can use transactions to provide fine-grained mutual exclusion.
If you're using MyISAM, you can use LOCK TABLE to prevent B from accessing the table until A finishes making its changes.
If two clients try to update the same field at the same time, it's unpredictable which one will win the race. The database has internal mutual exclusion to serialize the two queries, but the specific order is essentially random.
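A sketch of the InnoDB/transaction approach mentioned above, assuming a hypothetical counters table with id and value columns:
-- simplest case: let the database do the increment atomically in one statement
UPDATE counters SET value = value + 1 WHERE id = 1;

-- when the new value must pass through the application first,
-- lock the row inside a transaction so B waits for A
START TRANSACTION;
SELECT value FROM counters WHERE id = 1 FOR UPDATE;
-- ... compute the new value in Java ...
UPDATE counters SET value = 42 WHERE id = 1;
COMMIT;
The FOR UPDATE lock makes a second transaction's SELECT ... FOR UPDATE on the same row wait until the first one commits, which removes the race described in the question.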

Medium-term temporary tables - creating tables on the fly to last 15-30 days?

Context
I'm currently developing a tool for managing orders and communication between technicians and services. The industrial context is broadcast and TV. Multiple clients expect media files, each made to their own specs, which implies widely varying workflows even within the restricted scope of a single client's orders.
One client can ask one day for a single SD file and the next for a full-blown HD package containing up to fourteen files... In a MySQL db I am trying to store accurate information about all the small tasks composing the workflow, in multiple forms:
DATETIME values every time a task is accomplished, for accurate tracking
paths to the newly created files in the company's file system in VARCHARs
archiving background info in TEXT values (info such as user comments, e.g. when an incident happens and prevents moving forward, they can comment about it in this feed)
Multiply that by 30 different file types and this is way too much for a single table. So I thought I'd break it up by client: one table per client, so that any order only ever requires the use of that one table, which doesn't manipulate more than 15 fields. Still, this is a pretty rigid solution when a client has 9 different transcoding specs and a particular order only requires one. I figure I'd need to add flag fields for each transcoding field to indicate which ones are required for that particular order.
Concept
I then had this crazy idea that maybe I could create a temporary table to last while the order is running (that can range from about 1 day to 1 month). We rarely have more than 25 orders running simultaneously so it wouldn't get too crowded.
The idea is to make a table tailored for each order, eliminating the need for flags and unnecessary, forever-empty fields. Once the order is complete, the table would get flushed, JSON-encoded, into a TEXT or BLOB so it can be restored later if changes need to be made.
Do you have experience with DBMSs (MySQL in particular) struggling with such practices, if this has ever been done? Does this sound like a viable option? I am happy to try (which I have already started), and I am seeking advice on whether to keep going or stop right here.
Thanks for your input!
Well, of course that is possible to do. However, you cannot use MySQL temporary tables for such long-term storage; you will have to use "normal" tables and have some clean-up routine...
However, I do not see why that amount of data would be too much for a single table. If your queries start to run slow due to the amount of data, then you should add some indexes to your database. I also see another con: it will be much harder to build reports later on. When you have 25 tables with the same kind of data, you will have to run 25 queries and merge the data.
I do not see the point, really. The same kinds of data should be in the same table.
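As an illustration of "the same kinds of data in the same table", a sketch with invented names (order_task, task_type, and so on):
CREATE TABLE order_task (
  order_task_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  order_id      INT UNSIGNED NOT NULL,
  client_id     INT UNSIGNED NOT NULL,
  task_type     VARCHAR(50)  NOT NULL,   -- e.g. 'transcode_hd', 'qc_check'
  completed_at  DATETIME NULL,           -- set when the task is accomplished
  file_path     VARCHAR(255) NULL,       -- path in the company's file system
  notes         TEXT NULL,               -- user comments / incident feed
  KEY idx_order (order_id),
  KEY idx_client (client_id)
);
One row per task means an order only carries the tasks it actually needs (no forever-empty fields), and a report across all clients is a single query instead of 25.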

MySQL structure for DBs larger than 10mm records

I am working with an application which has 3 tables, each with more than 10 million records and larger than 2 GB.
Every time data is inserted there's at least one record added to each of the three tables and possibly more.
After every INSERT a script is launched which queries all these tables in order to extract data relevant to the last INSERT (let's call this the aggregation script).
What is the best way to divide the DB in smaller units and across different servers so that the load for each server is manageable?
Notes:
1. There are in excess of 10 inserts per second and hence the aggregation script is run the same number of times.
2. The aggregation script is resource intensive
3. The aggregation script has to be run on all the data in order to find which one is relevant to the last insert
4. I have not found a way of somehow dividing the DB into smaller units
5. I know very little about distributed DBs, so please use very basic terminology and provide links for further reading if possible
There are two answers to this from a database point of view.
Find a way of breaking up the database into smaller units. This is very dependent on the use of your database. This is really your best bet because it's the only way to get the database to look at less stuff at once. This is called sharding:
http://en.wikipedia.org/wiki/Shard_(database_architecture)
Have multiple "slave" databases in read-only mode. These are basically copies of your database (with a little lag). For any read-only queries where that lag is acceptable, have the code across your entire site access these replicas. This will take some load off of the master database you are querying. But any particular query will still be resource-intensive.
From a programming perspective, you already have nearly all your information (aside from ids). You could try to find some way of using that information for all your needs rather than having to requery the database after insert. You could have some process that only creates ids that you query first. Imagine you have tables A, B, C. You would have other tables that only have primary keys that are A_ids, B_ids, C_ids. Step one, get new ids from the id tables. Step two, insert into A, B, C and do whatever else you want to do at the same time.
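A rough sketch of that id-table idea; a_ids, a, and payload are invented names:
-- a table whose only job is to hand out ids
CREATE TABLE a_ids (
  a_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY
);

-- step one: reserve an id without touching the big table
INSERT INTO a_ids VALUES (NULL);
SELECT LAST_INSERT_ID();          -- the new A id

-- step two: insert into A (and kick off the aggregation) already knowing the id
INSERT INTO a (a_id, payload) VALUES (123, '...');   -- 123 = the id obtained above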
Also, the general efficiency/performance of all queries should be reviewed. Make sure you have indexes on anything you are querying. Run EXPLAIN on all queries you are running to make sure they are using indexes.
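For example, a quick sketch of that check; the events table, device_id, and created_at names are placeholders:
-- add an index on the columns the aggregation script filters on
ALTER TABLE events ADD INDEX idx_device_created (device_id, created_at);

-- then confirm the query actually uses it
EXPLAIN
SELECT COUNT(*) FROM events
WHERE device_id = 42
  AND created_at > NOW() - INTERVAL 1 DAY;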
This is really a midlevel/senior dba type of thing to do. Ask around your company and have them lend you a hand and teach you.