Neo4j vs MySQL Benchmark

I tested the performance of both Neo4j and MySQL on a simple CRUD process, and I still wonder why it takes longer on Neo4j than on MySQL. On SELECT operations I see the same result: Neo4j takes considerably longer than MySQL. I wonder if I'm not doing things properly.
-----Neo4j-----
profile match (n:User {name: "kenlz"}) using index n:User(name) set n.updated = "2016-04-18 10:00:00"
Total update time for a specific user (3 records found): 3139 milliseconds
profile match (n:User {enabled: 1}) using index n:User(enabled) set n.updated = "2016-04-18 10:00:00"
Total update time for all matching users (limit 1116961): 27563 milliseconds
-----MySql-----
update tbl_usr set updated = now() where name = 'kenlz';
Total update time for a specific user (3 records found): 1170 milliseconds
update tbl_usr set updated = now() where enabled = 1;
Total update time for all matching users (limit 1116961): 5579 milliseconds

Your operations look reasonable. But consider that the power of a graph database like Neo4j lies in the locality of data: so-called graph traversals (e.g. visiting consecutive edges and nodes along a path) are where it excels, and those perform really badly in a relational DBMS like MySQL. Bulk updates over a single table, as in your benchmark, play to MySQL's strengths instead.
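To make that concrete, here is what a traversal costs relationally. A hedged sketch of a two-hop "friends of friends" query in SQL (the friendship table and its columns are hypothetical): every extra hop adds another self-join over the whole table, whereas a graph engine just follows pointers from node to node.

-- Hypothetical schema: friendship(user_id, friend_id), one row per edge.
-- Two hops already need a self-join; each additional hop adds another
-- join over the entire table.
SELECT DISTINCT f2.friend_id
FROM friendship f1
JOIN friendship f2 ON f2.user_id = f1.friend_id
WHERE f1.user_id = 42
  AND f2.friend_id <> 42;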

Related

MySQL + Grafana: slow queries on time series data

I'm new to Grafana and MySQL.
I have an IoT device sending data every 1 second to an Amazon RDS Aurora database running with MySQL (db.t3.small). With Grafana, I added the RDS Datasource and created a Dashboard to visualize a sensor value on a chart (5 minutes of data displayed, so 300 values).
I noticed that for the first hours, the queries run fast and the dashboard refreshes in less than a second.
However, after 24 hours of data sent to the database, it takes more than 20 seconds to refresh the dashboard. I tried with a Unix timestamp type stored as an integer and with the TIMESTAMP type of MySQL, but I got the same problem. Here is my query:
SELECT
  ts_timestampFormat AS "time",
  LAeq
FROM testsGrafanaTable
WHERE
  $__timeFilter(ts_timestampFormat) AND
  station_id = 'station-1'
ORDER BY ts_timestampFormat
Could you help me understand why this is happening? Is it related to the query, or to the database performance?
How can I get a faster query?
I also tried using an explicit time range in the WHERE clause, ts_timestampFormat >= $__timeFrom() AND ts_timestampFormat <= $__timeTo(), but I got the same issue.
Thanks!
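For what it's worth, this access pattern (equality filter on station_id, range filter and sort on the timestamp) is usually served by a composite index; without one, MySQL scans more and more rows as data accumulates, which would explain the slowdown. A hedged sketch, reusing the table and column names from the question (the index name is made up):

-- Assumption: no composite index exists yet on these two columns.
-- With one, the WHERE filter and the ORDER BY can both be satisfied
-- by an index range scan instead of a full table scan.
ALTER TABLE testsGrafanaTable
  ADD INDEX idx_station_time (station_id, ts_timestampFormat);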

How to view all queries in MemSQL or MySQL?

How can I view a list/history of all queries in a database/cluster in MemSQL or MySQL, including completed and currently running queries? I would like to see the status of any query: whether it completed, is still running, or was aborted. Is there a query I can run to view this? Thank you.
MemSQL has information_schema views to get info about running and completed/failed queries. Take a look at https://docs.memsql.com/concepts/v6.0/workload-profiling/.
For example the following query will show all the queries that have been run in the last 10 minutes:
select query_text, success_count, failure_count
from information_schema.mv_activities_cumulative
join information_schema.mv_queries using (activity_name)
where last_finished_timestamp > now() - interval '10' minute;
You can also use these views to drill deeper to understand the resource usage of queries.
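For instance, something along these lines should surface the most expensive statements (cpu_time_ms and elapsed_time_ms are column names from the same workload-profiling views; treat them as assumptions to verify against your version):

-- Sums are per activity across partitions; adjust the grouping as needed.
select q.query_text,
       sum(a.cpu_time_ms) as cpu_ms,
       sum(a.elapsed_time_ms) as elapsed_ms
from information_schema.mv_activities_cumulative a
join information_schema.mv_queries q using (activity_name)
group by q.query_text
order by cpu_ms desc
limit 10;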
According to the official MemSQL documentation, in the Management View Reference: "mv_activities determines recent resource use by computing the change in mv_activities_cumulative over an interval of time. This interval is controlled by the value of the activities_delta_sleep_s session variable."
So, if I read the documentation correctly, something like the following should set activities_delta_sleep_s to 30 seconds, and the query should then show all database activities in decreasing CPU time. My problem with it is that I don't seem to see all of the activities I perform, and a lot of the query text is blank. According to the MemSQL forums, activities_delta_sleep_s simply makes all management-view calls sleep for that amount of time before returning results; I think the idea is that this allows activities to be aggregated across nodes and network traffic.
# set the lookback period desired
set activities_delta_sleep_s = 30;

# This query is supposed to show resource costs of server activities,
# but I'm not sure how to identify specific queries.
SELECT *  # look at QUERY_TEXT
FROM information_schema.MV_ACTIVITIES a
LEFT JOIN information_schema.mv_queries q ON a.ACTIVITY_NAME = q.ACTIVITY_NAME
ORDER BY 4 DESC;

How to purge old entries from a single column of a big MySQL database?

I need to remove the payload of old database entries while keeping their other data (id and other properties).
The table in question has a message_id column (a datestamp concatenated with other info), a content column (a BLOB that makes up over 90% of the database's total size) and some other columns that are of no use in this case.
I first tried running a simple update with a condition:
UPDATE LOW_PRIORITY repository SET content="" WHERE SUBSTR( message_id, 6, 6 )<201601 AND message_box = "IN";
I extract YYYYMM from each entry's message_id, and if it is older than a chosen cutoff month, I replace content with an empty string.
The database is over 25GB in size, holds almost 2 million entries in my table, and runs on very modest hardware. My query failed with an error after running for some time:
ERROR 2013 (HY000): Lost connection to MySQL server during query
Usually I try to avoid changing database variables, but I knew this error also pops up when restoring a database from a large dump file, so I went ahead and updated the setting to handle a 100MB packet size:
set global max_allowed_packet=104857600;
Re-running my UPDATE query resulted in the same error:
ERROR 2013 (HY000): Lost connection to MySQL server during query
As I have mentioned, my MySQL server runs on very modest hardware, and I'd prefer not to modify settings that could make the server exceed its available resources. So instead of increasing all the available timeout variables, I decided to run my query in smaller chunks, with a query like this:
UPDATE LOW_PRIORITY repository SET content="" WHERE message_id in (select message_id from(select message_id from repository where SUBSTR( message_id, 6, 6 )<201603 AND message_box = "IN" limit 0, 1000)as temp);
This query fails with an error:
ERROR 1206 (HY000): The total number of locks exceeds the lock table size
It also fails with the same error even when the query is limited to a single row with LIMIT 1!
Am I using pagination incorrectly, or is there a better way of doing this?
*The DB is running on a virtual Ubuntu server with a dual-core Intel CPU, 1GB of RAM and a 100GB HDD. It's completely adequate for its daily tasks, and I'd really rather not increase the specs for just this one query.
You are trying to trick MySQL into doing something it doesn't want to do (using LIMIT inside an IN subquery) in a complicated way (complicated = more resources). That is not wrong, but you can simply write:
UPDATE LOW_PRIORITY repository SET content=""
WHERE content <> ""
and SUBSTR( message_id, 6, 6 ) < 201603 AND message_box = "IN"
limit 1000;
This will update the first 1000 old rows that still have content in them; rerun the statement until it reports zero affected rows.
I would imagine your #1 problem here is that your WHERE condition cannot use an index on the message_id field.
Why not simply do:
WHERE message_id < 20160100* ...
Assuming this is an integer field, 201512** would be less than 201601** anyway, so there would be no change in your outcome. But removing the substring function would allow you to use an index on that field.
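To make the index idea concrete, a hedged sketch (the index name is made up, and this only helps if message_id values compare in date order, e.g. a fixed-width prefix before the datestamp):

-- Combined with a direct range predicate on message_id, the optimizer can
-- walk an index range instead of evaluating SUBSTR() on every row.
CREATE INDEX idx_repository_message_id ON repository (message_id);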

Server Monitoring Software - designing a MySQL database

I want to create a monitoring script that will check my servers. Right now I'm stuck on a problem: I need to find a way to get an uptime percentage. All data is stored in a MySQL server. For me the easiest way to get uptime is to have a function add a new record to the MySQL server every minute with the date, time, whether the server is online, etc. But if I use this method and I have, for example, 1000 servers to monitor, I will end up with 518,400,000 records in the MySQL server per year.
Another idea was to create one record per server with two counters, online and offline, but without any date and time I'm not able to get uptime...
Any ideas on how to design a database for a monitoring system?
The MySQL information_schema contains uptime information (expressed in seconds) for each server. I am not sure how accurate your figure has to be, but you could read this value at a set interval and compare it to the previous value. Depending on the interval, this could give you a good approximation.
SELECT Variable_Value FROM information_schema.SESSION_STATUS S WHERE Variable_Name = 'UPTIME';
Also, the MySQL error log contains a date and time stamp when the server starts. Scrape this info periodically and add it to your server table.
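If the machines being monitored are not themselves MySQL servers, the per-minute-record design from the question still works, and the uptime percentage falls out of a simple average. A minimal sketch, with all table and column names hypothetical:

-- One row per server per check; the composite primary key keeps
-- the sampler idempotent per minute.
CREATE TABLE server_checks (
  server_id  INT        NOT NULL,
  checked_at DATETIME   NOT NULL,
  is_online  TINYINT(1) NOT NULL,
  PRIMARY KEY (server_id, checked_at)
);

-- Uptime percentage for one server over the last 30 days: the average
-- of a 0/1 flag is exactly the fraction of checks that found it online.
SELECT 100 * AVG(is_online) AS uptime_pct
FROM server_checks
WHERE server_id = 42
  AND checked_at >= NOW() - INTERVAL 30 DAY;

Storing only state transitions instead of every sample would cut the row count dramatically, at the cost of slightly more involved interval arithmetic.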

mysql update query optimization

I have a database with 200+ entries, and a cronjob updates the database every 5 minutes. All entries are unique.
My code:
foreach ($players as $pl) {
    // Update both state columns for one player, matched by primary key.
    mysql_query("UPDATE rp_players SET state_one = '".$pl['s_o']."', state_two = '".$pl['s_t']."' WHERE id = '".$pl['id']."' ")
        or die(mysql_error());
}
That is 200+ queries every 5 minutes. I don't know what will happen when my database has far more entries (2000... 5000+). I think the server will die.
Is there any solution (optimization or something...)?
I think there isn't much you can do except make the cron run every 10 minutes instead if it keeps getting slower. You could also set up a rule to delete entries older than X days.
If id is your primary (and, as you mentioned, unique) key, updates should be fast and can't be optimised much further, since it's a primary key (if it isn't, see whether you can add an index).
The only problem which could occur (to my mind) is cronjob overlap due to slow updates: suppose your job starts at 1:00am and isn't finished by 1:05am; your queries will pile up, creating server load, slow response times, etc.
If this is your case, you could use RabbitMQ to queue your update queries and process them in a more controlled way.
I would load all data that is to be updated into a temporary table using the LOAD DATA INFILE command: http://dev.mysql.com/doc/refman/5.5/en/load-data.html
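A hedged sketch of that staging step (file path, delimiters and column types are assumptions to adapt to your export format):

-- Stage the fresh values; the CSV is assumed to hold id,state_one,state_two.
CREATE TEMPORARY TABLE tmp_players (
  id        INT PRIMARY KEY,
  state_one VARCHAR(32),
  state_two VARCHAR(32)
);

LOAD DATA INFILE '/tmp/players.csv'
INTO TABLE tmp_players
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(id, state_one, state_two);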
Then, you could update everything with one query:
UPDATE rp_players p
INNER JOIN tmp_players t
    ON p.id = t.id
SET p.state_one = t.state_one
  , p.state_two = t.state_two
;
This would be much more efficient because it removes all the back-and-forth with the server that you incur by running a separate query on every pass through a PHP loop.
Depending on where the data is coming from, you might be able to remove PHP from this process entirely.