If I had to develop a Core Java application which processes CSV files and stores the output in an open-source DB, where:
Data size would be 10 GB initially (porting from existing sources)
It would grow at 1 GB per month
A typical transaction could fetch 100,000 rows
It could be accessed by 1000 users at a given time
and I had the choice of:
MongoDB
MySQL
PostgreSQL
which would be the best choice of DB?
This compares MongoDB with MySQL
This compares PostgreSQL to MySQL
Security alerts for MongoDB
With growing data it's better to have a DB that scales easily; SQL databases don't scale out smoothly and eventually struggle with it, which is why highly scalable databases are usually preferred for Big Data.
But you said that entries can have correlations with each other, so in that case it's better to use a relational DB, because the NoSQL ones can "lose" some of those correlations.
Like @Craig Ringer said, don't consider only those DBs; there are a lot of different solutions, each with its own pros and cons (for example, Redis is very fast but offers almost no complex query logic because it's a simple key-value store; Cassandra is faster than Mongo but works better with structured data; Mongo is a document DB, so it can store any kind of data in the same collection).
IMHO you should set up some benchmarking sessions with the different DBs and your use cases, focus on what you need to be fast, and then choose the one that performs best in that area.
I want to periodically insert data from a MySQL database into ClickHouse, i.e., when data is added/updated in the MySQL database, I want that data to be added to ClickHouse automatically.
I am thinking of using Change Data Capture (CDC). CDC is a technique that captures changes made to data in MySQL and applies them to the destination ClickHouse table. It only imports changed data, not the entire database. To use CDC with a MySQL database, we must use the binary log (binlog). The binlog lets us capture change data as a stream, enabling near real-time replication.
The binlog captures not only data changes (INSERT, UPDATE, DELETE) but also table schema changes such as ADD/DROP COLUMN. It also ensures that rows deleted from MySQL are deleted in ClickHouse as well.
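As a rough check that the MySQL source is configured for binlog-based change capture (the exact requirements depend on the tool used; these are just the usual baseline settings, not a complete list):

SHOW VARIABLES LIKE 'log_bin';          -- binary logging must be enabled
SHOW VARIABLES LIKE 'binlog_format';    -- ROW is needed for row-level change capture
SHOW VARIABLES LIKE 'binlog_row_image'; -- FULL keeps complete before/after row images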
Once I have the changes, how can I insert them into ClickHouse?
[experimental] MaterializedMySQL
Creates a ClickHouse database containing all the tables existing in MySQL, and all the data in those tables.
The ClickHouse server works as a MySQL replica: it reads the binlog and performs the corresponding DDL and DML queries.
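A minimal sketch of setting it up, based on the linked docs (host, database, user and password are placeholders, and the exact name of the experimental setting may vary between ClickHouse versions):

SET allow_experimental_database_materialized_mysql = 1;

CREATE DATABASE mysql_replica
ENGINE = MaterializedMySQL('mysql-host:3306', 'source_db', 'repl_user', 'secret');

After that, querying tables inside mysql_replica returns the replicated MySQL data.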
https://clickhouse.tech/docs/en/engines/database-engines/materialized-mysql/
https://altinity.com/blog/2018/6/30/realtime-mysql-clickhouse-replication-in-practice
https://clickhouse.tech/docs/en/sql-reference/dictionaries/external-dictionaries/external-dicts-dict-sources/#dicts-external_dicts_dict_sources-mysql
https://altinity.com/blog/dictionaries-explained
https://altinity.com/blog/2020/5/19/clickhouse-dictionaries-reloaded
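The dictionary links describe a different approach: exposing individual MySQL tables to ClickHouse as external dictionaries instead of replicating the whole database. A rough sketch (connection details, table and column names are placeholders):

CREATE DICTIONARY users_dict
(
    id UInt64,
    name String
)
PRIMARY KEY id
SOURCE(MYSQL(host 'mysql-host' port 3306 user 'repl_user' password 'secret' db 'source_db' table 'users'))
LAYOUT(HASHED())
LIFETIME(MIN 300 MAX 600);

-- then, inside ClickHouse queries:
SELECT dictGet('users_dict', 'name', toUInt64(42));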
I have about 10,000 records in an old MySQL database used by a PHP application. This old database has no structure or relationships defined; it's a completely legacy design. I'm now refactoring the entire system, and the new tables and their relationships have been completely defined.
The issue that remains is how best to move the data from the old database (used by PHP without a framework) to the new one (written in Laravel).
Would Laravel commands be a good option, where I read data from the old database, specify which columns are needed, and then insert into the new database?
Off the top of my head, the following comes to mind:
1. Plain raw SQL
You could write a series of raw SQL statements which read the old database and insert records into the new database. This can be done without the help of an ORM like Eloquent; see the sketch after the pros and cons below.
Advantages:
Nothing beats raw SQL in performance, so the migration will run fast
Disadvantages:
If the database structure is very different it might be hard to write the correct queries
It's easier to forget things like adding primary and foreign keys
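A hypothetical sketch of such a statement, mapping legacy columns onto the new schema (all table and column names here are made up):

-- copy customers from the legacy schema into the new Laravel schema
INSERT INTO new_db.customers (id, name, email, created_at)
SELECT c.customer_id, c.customer_name, c.email_address, NOW()
FROM old_db.tbl_customer AS c;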
2. Laravel commands
You could write one (or multiple) artisan commands which perform the data migration (in steps). This way you can use the DB facade in Laravel to read the old database and use Eloquent to write the data to the new database.
Advantages:
Easier to write, as you can leverage Eloquent models
Eloquent takes care of things you otherwise might forget like adding primary and foreign keys
Disadvantages:
Raw SQL will probably outperform the usage of Eloquent.
If you have large amounts of data you'll have to optimize your scripts for memory usage. Otherwise you might run into memory limit issues.
So Laravel commands could surely be a good solution depending on how different your data structures are, how large your datasets are and how important performance is.
What are the differences between MySQL and Oracle databases? I know both are RDBMSs, both use SQL as the query language, and both are developed by Oracle. So what are the differences between the two, technically?
I used Oracle at Deutsche Bank for 1.5 years, and I have some experience with MySQL from another job.
In general, Oracle is a much more powerful and deeper RDBMS, which lets you build arbitrarily complex systems. That's why it is used in banking, the military, and science.
MySQL is a light, simple RDBMS; it works very well for the web, for example a small online shop, your personal web page, or a school's site. More complex web applications often use PostgreSQL instead.
Oracle lets you use packages (usually written in PL/SQL), cursors, the PL/SQL language, roles, snapshots, synonyms, and tablespaces.
Oracle also has more advanced, and somewhat different, data types.
For example:
MySQL's BIGINT (8 bytes) is called NUMBER(19,0) in Oracle.
One difference I keep noticing in Oracle is SELECT * FROM dual, where dual is a built-in virtual table used when selecting expressions that don't come from a real table; in MySQL you can simply run a SELECT without a FROM clause.
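For example, both of these return the current date/time:

-- MySQL: no FROM clause required
SELECT NOW();

-- Oracle: the same kind of expression must select from dual
SELECT SYSDATE FROM dual;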
For a deeper comparison, please check the comparison table on Oracle's website:
https://docs.oracle.com/cd/E12151_01/doc.150/e12155/oracle_mysql_compared.htm#i1027526
MySQL and Oracle are both RDBMSs. Oracle did not develop MySQL originally; it acquired it.
Much of the day-to-day difference is just syntax. For example, to limit the number of rows returned:
In MySQL:
SELECT * FROM tbl LIMIT 1;
In Oracle:
SELECT * FROM tbl WHERE ROWNUM <= 1;
MySQL is open source and Oracle is paid. For more differences in query syntax you can see here.
I'm working on a database that has one table with 21 million records. Data is loaded once when the database is created, and there are no further insert, update, or delete operations. A web application accesses the database to run SELECT statements.
It currently takes 25 seconds per request for the server to return a response, and if multiple clients make simultaneous requests the response time increases significantly. Is there a way of speeding this up?
I'm using MyISAM instead of InnoDB, with a fixed max rows setting, and I have indexed the searched field.
If no data is being updated/inserted/deleted, then this might be a case where you want to tell the database not to lock the table while you are reading it.
For MySQL this seems to be something along the lines of:
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED ;
SELECT * FROM TABLE_NAME ;
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ ;
(ref: http://itecsoftware.com/with-nolock-table-hint-equivalent-for-mysql)
More reading in the docs, if it helps:
https://dev.mysql.com/doc/refman/5.7/en/innodb-transaction-isolation-levels.html
The TSQL equivalent, which may help if you need to google further, is
SELECT * FROM TABLE WITH (nolock)
This may improve performance. As noted in other comments, good indexing may help, and maybe breaking the table out further (if possible) to spread things around so you aren't reading all the data when you don't need it.
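For example, a composite index covering the columns the WHERE clause filters on can let MySQL answer such queries largely from the index (table and column names below are made up):

-- hypothetical composite index on the searched columns
ALTER TABLE lookup_table
  ADD INDEX idx_search (search_col, other_filter_col);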
As a note: locking a table prevents other people from changing data while you are using it. Not locking a table that has a lot of inserts/deletes/updates may cause your SELECTs to return duplicate rows of the same data (as it gets moved around on the hard drive), rows with missing columns, and so forth.
Since you've got one table that you are selecting against, your requests are all taking turns locking and unlocking it. If you aren't doing updates, inserts, or deletes, then your data shouldn't change, so you should be fine to forgo the locks.
Lately I have been tasked with deleting and reinserting approximately 15 million rows in a MyISAM table that has about 150 million rows, while keeping the table/DB available for inserts/reads.
To do so, I started a process that takes small chunks of data and reinserts them via INSERT ... SELECT statements into a cloned table with the same structure, sleeping between runs so as not to overload the server; it skips over the data to be deleted and inserts the replacement data instead.
This way, while the cloned table was being built (which took 8+ hours), new data kept coming into the source table. At the end I just had to sync the tables with the data added during those 8+ hours and rename the tables.
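A rough sketch of one such chunk (table and column names here are hypothetical, and the chunk boundaries would advance on every run):

-- copy one id range into the clone, skipping the rows that are being replaced
INSERT INTO big_table_clone
SELECT *
FROM big_table
WHERE id BETWEEN 1 AND 100000
  AND id NOT IN (SELECT id FROM rows_to_delete);

-- ... insert the replacement rows for this range, sleep, move on to the next range ...

-- once the clone has caught up, swap the tables atomically
RENAME TABLE big_table TO big_table_old, big_table_clone TO big_table;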
Everything was fine except for one thing: the cardinality of the indexes on the cloned table is way off, and execution plans for queries executed against it are awful (some went from a few seconds to 30+ minutes).
I know this can be fixed by running ANALYZE TABLE on it, but that also takes a lot of time (I'm currently running one on a slave server and it has been executing for more than 10 hours now), and I can't afford to have this table offline for writes while the analyze is performed. It would also stress the server's I/O, putting pressure on the server and slowing it down.
Can someone explain why building a MyISAM table via INSERT ... SELECT statements results in a table with such poor internal index statistics?
Also, is there a way to incrementally build the table and end up with the indexes in good shape?
Thanks in advance.