Neo4j and its performance - how to use full computer power - MySQL

I'm doing a project that requires converting a SQL database (MySQL) to a graph database (Neo4j). The database has more than 20 tables, and the biggest one has about 13,000,000 rows. I've migrated only a few tables so far, and I already have more than 15,000,000 nodes and about the same number of relationships.
Loading the data is very slow (the nodes loaded quickly, but I had problems creating the relationships for the nodes from the biggest table). Also, MySQL queries run faster than the equivalent Neo4j ones.
When I look at Task Manager, I can see that Neo4j is not using the machine's full power. How can I configure Neo4j to use as much of the machine as possible?
I'm using a (Windows) laptop with
Intel Core i7 4702MQ, 2.2GHz
8 GB DDR3
SSD 850 EVO, 500GB
When a query runs, Neo4j uses 2-3 GB of RAM (there are about 2 GB more available) and maybe 20% of the CPU. I've tried changing the Java VM tuning options (neo4j-community.vmoptions), but that didn't give me any result.
I'm using Neo4j Community Edition, version 3.0.
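For reference, the pattern that usually fixes slow relationship loading like this is to create an index on the matched property and send the relationships in batches, rather than one CREATE per transaction. Below is a minimal sketch, assuming the official Neo4j Python driver and a Bolt connection; the label Person, property id, relationship type KNOWS, URI, credentials and batch size are all placeholders, not values from the setup described above.

# A minimal sketch, assuming the official Neo4j Python driver (pip install neo4j).
# Label, property, relationship type, URI and credentials are placeholders.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

BATCH_SIZE = 10000

# UNWIND lets one statement create thousands of relationships at once.
# Note: Neo4j 3.0 may still need the older {rows} parameter syntax instead of $rows.
CREATE_RELS = """
UNWIND $rows AS row
MATCH (a:Person {id: row.from_id})
MATCH (b:Person {id: row.to_id})
CREATE (a)-[:KNOWS]->(b)
"""

def load_relationships(rows):
    """rows: an iterable of dicts like {'from_id': 1, 'to_id': 2}."""
    with driver.session() as session:
        # Without an index on the matched property, every MATCH scans all nodes.
        # (3.x index syntax; newer servers use CREATE INDEX FOR ... ON ...)
        session.run("CREATE INDEX ON :Person(id)")
        batch = []
        for row in rows:
            batch.append(row)
            if len(batch) >= BATCH_SIZE:
                session.run(CREATE_RELS, rows=batch)
                batch = []
        if batch:
            session.run(CREATE_RELS, rows=batch)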

Related

MySQL Workbench does not use all the CPU cores

I have a little problem with my Workbench on Ubuntu 16.04.
I noticed that while I am copying multiple CSV rows into a table (not via INSERT), Workbench turns grey and then gets stuck for a while, usually for more than 10-15 minutes.
I know there are other ways to import CSV files, but the problem is that while it is copying the data into the table (just so I can see how the table will look after the insert), CPU1 goes straight to 100% while all the other CPUs stay around 5% to 10%; after a while the load switches to another CPU and the same story repeats. Since it is not MySQL doing the job but the software itself, why doesn't the software use all 4 cores to get a boost?
I have an Intel® Core™ i7-3537U CPU @ 2.00GHz × 4 and 8 GB RAM.
No.
A single connection in MySQL will use only a single CPU core.
If you could determine what underlying SQL is involved, perhaps we could help you find a more efficient way to do the task.
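If the underlying task is really a bulk CSV import, a server-side LOAD DATA statement is usually far faster than pasting rows through Workbench, and it keeps that single core busy with useful work. Here is a rough sketch using MySQLdb; the host, credentials, table and file names are placeholders, and local_infile has to be enabled on the server.

# Rough sketch: bulk-load a CSV with LOAD DATA LOCAL INFILE instead of pasting
# rows through Workbench. All names and credentials below are placeholders.
import MySQLdb

conn = MySQLdb.connect(host="localhost", user="user", passwd="password",
                       db="mydb", local_infile=1)
cur = conn.cursor()
cur.execute("""
    LOAD DATA LOCAL INFILE '/tmp/rows.csv'
    INTO TABLE my_table
    FIELDS TERMINATED BY ',' ENCLOSED BY '"'
    LINES TERMINATED BY '\\n'
    IGNORE 1 LINES
""")
conn.commit()
conn.close()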

AWS architecture for CPU-intensive MySQL searching and querying

I'm faced with a simple scaling problem. As college students, we are new to setting up AWS services, droplets, scaling, etc., and we are stuck deciding on the architecture of our app. I'm unable to decide whether to use one big Amazon EC2 instance or multiple smaller ones when benchmarking performance.
Even after code optimization, our MySQL queries are not as fast as we want them to be, and better hardware should address this problem. We are looking for high-performance servers; the workload is mostly a large number of fully indexed MySQL search queries over 1.3M records (which is clearly a CPU- and memory-intensive task). We intend to switch over to Solr at a later point. Both of these tasks are very CPU-demanding and RAM-dependent. As of now, we are running our web app stack entirely on a single machine with 2 cores and 4 GB RAM. However, we now wish to split the load across multiple instances/droplets, say 5 of them with 2 cores and 4 GB RAM each.
Our primary concern is that, if we did create multiple EC2 instances/droplets, wouldn't there be considerable overhead in communicating between the instances/droplets for a standard MySQL search? As far as I know, a MySQL connection uses sockets to connect to a local/remote host, and since this would be remote communication between 4 servers, I would expect significant overhead for EACH query.
For instance, let's say I've set up 4 instances and been allocated these IPs:
Server 1 - x.x.x.1
Server 2 - x.x.x.2
Server 3 - x.x.x.3
Server 4 - x.x.x.4
I set up a MySQL server and dump my database into each of these instances (which sounds like a very bad idea). Now I make the MySQL connections using Python:
import MySQLdb

# One connection per server; every query has to cross the network to its host.
conn1 = MySQLdb.connect(host=server1, user=user, passwd=password, db=db)
conn2 = MySQLdb.connect(host=server2, user=user, passwd=password, db=db)
conn3 = MySQLdb.connect(host=server3, user=user, passwd=password, db=db)
conn4 = MySQLdb.connect(host=server4, user=user, passwd=password, db=db)
Since none of these databases is on localhost, I would guess there is a huge overhead involved in contacting the server and fetching the data for each query.
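One way to turn that guess into a number is to time a trivial query against a local and a remote host before committing to an architecture. A small sketch (hosts and credentials are placeholders):

# Quick-and-dirty latency check: time a trivial query against each host to see
# how much per-query network overhead the remote servers really add.
import time
import MySQLdb

def round_trip_ms(host, runs=100):
    conn = MySQLdb.connect(host=host, user="user", passwd="password", db="db")
    cur = conn.cursor()
    start = time.time()
    for _ in range(runs):
        cur.execute("SELECT 1")
        cur.fetchall()
    conn.close()
    return (time.time() - start) * 1000 / runs

for host in ("localhost", "x.x.x.1", "x.x.x.2"):
    print(host, round(round_trip_ms(host), 2), "ms per query")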
My Thoughts:
I'm guessing there must be a way to integrate different droplets/instances. Unfortunately, I haven't found any resources to support that.
I've looked into Amazon RDS, which seems like a good fit. But again, I wouldn't be able to benchmark a 4-instance MySQL search against a single huge AWS RDS server (given that it is quite expensive for new apps).
We are also unsure whether replacing Python with a language popular for scaling, such as Scala, would help tackle the problem of dealing with multiple servers.
Any suggestions will be greatly appreciated by our 3-member team :)

Run MySQL in RAM

I have a moderately large database (~75 million rows, ~300 GB) and a fairly powerful machine (Amazon r3.8xlarge, 32 CPUs, 244 GB RAM). Unfortunately, MySQL doesn't seem to be using all this power; it barely touches a tenth of the RAM.
My application of this database is a python program which reads the tables, crunches numbers, and stores results back in the database. There's only one user, so no table locks are needed.
Ideally, nearly the entire database would be stored in RAM. I considered mounting a straight RAM filesystem, but this means having to offload everything back to a hard drive myself. I set the key_buffer_size and innodb_buffer_pool_size to 100G each, but it doesn't seem to have any effect.
Is there something obvious I'm missing?
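One quick sanity check is to ask the running server which values it actually picked up, since settings placed in the wrong config file, or changed without a restart, are silently ignored. A small sketch (credentials are placeholders):

# Sanity check: confirm the buffer sizes the server is really using and how
# full the InnoDB buffer pool is. Credentials below are placeholders.
import MySQLdb

conn = MySQLdb.connect(host="localhost", user="root", passwd="password")
cur = conn.cursor()

cur.execute("SHOW VARIABLES LIKE 'innodb_buffer_pool_size'")
print(cur.fetchall())   # should show ~100 GB (in bytes) if the setting took effect

cur.execute("SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages%'")
for name, value in cur.fetchall():
    print(name, value)  # pages_data vs pages_total shows how much is actually cached

conn.close()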

Fragmented SQL Table on Disk

I have an instance of MySQL Server 5.6.20 running on Windows Server 2012. One table in my database is particularly large (23 GB on disk, 31 million rows). When I query this table, even for simple operations such as a COUNT(*), the performance is terrible, frequently taking as long as 40 minutes to complete.
Checking Resource Monitor, I see Highest Active Time pinned at 100%, but the disk is only reading 1.5-2.0 MB per second (well below the drive's peak performance). Internet research suggests this happens when reading highly fragmented files, but the only file being read is the MySQL InnoDB data file. Am I interpreting this correctly, that the data file itself is heavily fragmented? Is there a SQL-specific solution, or is Windows defrag the correct approach to this problem?
EDIT
There are two Dell PERC H310 SCSI 1.8 TB disks in the machine. Only one is formatted. RAID was never set up. No SSDs are installed.
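For the "SQL-specific solution" part of the question, InnoDB's own free-space counter can be checked and the table rebuilt from SQL, which deals with fragmentation inside the .ibd file without touching Windows defrag. A sketch under assumed names (schema, table name and credentials are placeholders, and the rebuild will take a long time on a 23 GB table):

# Sketch: check how much unused space InnoDB reports for the table, then
# rebuild it in place. All names and credentials below are placeholders.
import MySQLdb

conn = MySQLdb.connect(host="localhost", user="root", passwd="password", db="mydb")
cur = conn.cursor()

cur.execute("""
    SELECT data_length, index_length, data_free
    FROM information_schema.tables
    WHERE table_schema = 'mydb' AND table_name = 'big_table'
""")
print(cur.fetchone())  # a large data_free relative to data_length suggests fragmentation

# For InnoDB, OPTIMIZE TABLE is mapped to a full table rebuild (recreate + analyze).
cur.execute("OPTIMIZE TABLE big_table")
print(cur.fetchall())
conn.close()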

MySQL Cluster Node Specific Hardware

I am looking at setting up a MySQL Cluster with two high-end Dell servers (dual Opteron 4386 CPUs, 16 GB RAM and RAID 10 SAS). I also have a handful of other high-end machines, i7s with 20 GB+ RAM each on average.
What is the hardware focus for each node in MySQL Cluster?
I know that the management node requires very little as it is simply an interface to control the cluster and can be put on the same machine as mysqld.
What hardware is most important for the MySQL nodes and for the data nodes (hard disk I/O, RAM, CPU, etc.)?
You're correct that the management node needs very few resources - just make sure you run it/them on different machines from the data nodes.
Data nodes like lots of memory, as by default all of the data is held in RAM; they can also make good use of fast cores and, in more recent releases, lots of them (perhaps up to about 48). Multiple spindles for the redo/undo logs, checkpoints, disk table files, etc. can speed things up.
MySQL servers don't need a lot of RAM, as they don't store the Cluster data, and in most cases you wouldn't use the query cache. If replicating to a second Cluster, make sure they have enough disk space for the binary log files.