I'm loading an RDB with dummy data to practice query optimization. MySQL Workbench executed 10,000 INSERTs into my customers table without returning an error. Yet when I SELECT * from that table, I only get back exactly 1,000 records in the result set. I am using InnoDB as my table engine.
According to this link I should have an unlimited number of records available and a 64 TB overall size limit. I'm inserting 10,000 records with 4 VARCHAR(255) columns and 2 BOOLEAN columns each, and I don't think that even tops 1 TB. Am I wrong in this assumption?
Is the result grid limited to 1,000 records? Is there an alternative to InnoDB that supports foreign keys? Is the problem that VARCHAR(255) is way too large and I need to reduce it to something like VARCHAR(50)? What am I not understanding?
THANK YOU IN ADVANCE
In the query editor toolbar there's a drop-down where you can limit the number of records returned. The default is 1,000, but you can change that over a wide range, including no limit at all.
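To confirm that every row really made it into the table (the grid limit only affects what Workbench displays), you can count them yourself; this is just a sketch, assuming the table from the question is named customers:

SELECT COUNT(*) FROM customers;
-- should return 10000 if every INSERT succeeded
SELECT * FROM customers LIMIT 20000;
-- a LIMIT written into the statement itself should take precedence over the drop-down default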
No, it is not limited to 1,000 records. I have complex InnoDB tables of more than 50 million records with BLOBs and multiple indexes. InnoDB is perfectly fine; you don't have to look for another engine. Could you be more precise about the context in which you executed the query? Was it from a programming language, the command-line mysql client, or another MySQL client?
Many database query tools limit the number of rows they return. Try selecting some data starting from a high row number to see whether your data is there (it should be).
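For example, a query with an explicit offset will show whether the later rows exist; the table name customers is assumed from the question:

SELECT * FROM customers LIMIT 100 OFFSET 9000;
-- returns rows 9,001 through 9,100 if all 10,000 rows were inserted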
I thought this would be useful for future reference:
In Microsoft SQL Server Management Studio, change the number of rows returned under Tools -> Options -> SQL Server Object Explorer -> Value for Select Top <n> Rows Command.
Related
I was migrating a database from a server to the AWS cloud, and decided to double-check the success of the migration by comparing the number of entries in the tables of the old database and the new one.
I first noticed that of the 46 tables I migrated, 13 reported different sizes; on further inspection, 9 of those 13 tables were actually bigger in the new database than in the old one. There are no scripts or code currently set up against either database that would change the data, let alone the amount of data.
I then further inspected one of the smaller tables (only 43 rows) in the old database and noticed that the SQL query below returned a TABLE_ROWS value of 40 instead of the actual 43. The same was the case for another small table in the old database, where the query reported 8 rows but there were actually 15. (I manually counted multiple times to confirm both cases.)
However, when I ran the same query against the new, migrated database, it reported the correct number of rows for those two tables.
SELECT TABLE_ROWS, TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'db_name';
Any thoughts?
Reading the documentation: https://dev.mysql.com/doc/refman/8.0/en/tables-table.html
TABLE_ROWS
The number of rows. Some storage engines, such as MyISAM, store the exact count. For other storage engines, such as InnoDB, this value is an approximation, and may vary from the actual value by as much as 40% to 50%. In such cases, use SELECT COUNT(*) to obtain an accurate count.
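To see this on a single table, compare the estimate with an exact count; db_name and table_name below stand in for your own names:

SELECT TABLE_ROWS FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'db_name' AND TABLE_NAME = 'table_name';
SELECT COUNT(*) FROM table_name;
-- for InnoDB the first value is an estimate; the second is exact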
Were there any errors or warnings in the migration log? There are many ways to migrate MySQL table data; I personally like to use mysqldump and import the resulting SQL file using the mysql command-line client. In my experience, importing via GUI clients always has some shortcomings.
In order for information_schema to not be painfully slow when retrieving this for large tables, it uses estimates, based on the cardinality of the primary key, for InnoDB tables. Otherwise it would end up having to do SELECT COUNT(*) FROM table_name, which for a table with billions of rows could take hours.
Look at SHOW INDEX FROM table_name and you will see that the number reported in information_schema is the same as the cardinality of the PK.
Running ANALYZE TABLE table_name will update the statistics, which may make them more accurate, but the result will still be an estimate rather than a just-in-time checked row count.
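A quick way to check this on one of the affected tables (substitute your own table name for table_name):

SHOW INDEX FROM table_name;
-- compare the Cardinality value of the PRIMARY key with TABLE_ROWS
ANALYZE TABLE table_name;
-- refreshes the statistics; the estimate usually moves closer to the real COUNT(*)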
I am running MariaDB on a vServer (8 CPU vCores, 32 GB RAM) with a few dozen database tables which aggregate data from external services around the web for efficient use across my collection of websites (basically an API layer with database caching and its own API for easy use in all of my projects).
All but one of these database tables allow quick, basic queries such as
SELECT id, content FROM tablename WHERE date_added > somedate
(with "content" being some JSON data). I am using InnoDB as the storage engine to allow inserts without table locking, "id" is always the primary key in any table and most of these tables only have a few thousand or maybe a few hundred thousand entries, resulting in a few hundred MB.
One table where things don't work properly though has already >6 million entries (potentially heading to 100 million) and uses >60 GB including indexes. I can insert, update and select by "id" but anything more complex (e.g. involving a search in 1 or 2 additional fields or sorting the results) runs into infinity. Example:
SELECT id FROM tablename WHERE extra = ''
This query would select entries where "extra" is empty. There is an index on "extra" and
EXPLAIN SELECT id FROM tablename WHERE extra = ''
tells me it is just a SIMPLE query with the correct index automatically chosen ("Using where; Using index"). With a low LIMIT I am fine, but when I select thousands of results the query never stops running. Using more than one field in my search, even with a combined index and with that index's name explicitly added to the query, I'm out of luck as well.
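For illustration, the multi-field case looks roughly like this (tablename, extra, date_added and the index name idx_extra_date are placeholders for my real names and an assumed combined index):

EXPLAIN SELECT id FROM tablename FORCE INDEX (idx_extra_date) WHERE extra = '' AND date_added > '2020-01-01';
-- even with the index forced, without a low LIMIT the query never returns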
Since there is more than enough storage available on my vServer, and MariaDB/InnoDB don't have such low size limits for tables, there might be some setting or other limitation that prevents me from running queries on larger tables. Looking through all the MariaDB settings I couldn't find anything appropriate, though.
I would be glad if someone could point me in the right direction.
What's the best way to store a large amount of data in a database?
I need to store values of various environmental sensors with timestamps.
I have done some benchmarks with SQL CE; it works fine for a few hundred thousand rows, but once it gets into the millions the SELECTs become horribly slow.
My actual tables:
Datapoint:[DatastreamID:int, Timestamp:datetime, Value:float]
Datastream: [ID:int{unique index}, Uint:nvarchar, Tag:nvarchar]
If I query for Datapoints of a specific Datastream and a date range, it takes ages, especially when I run it on an embedded Windows CE device, and that is the main problem. On my development machine a query takes about 1 second, but on the CE device it takes about 5 minutes.
Every 5 minutes I log 20 sensors: 12 readings per hour * 24 hours * 365 days = 105,120 readings per sensor per year, times 20 sensors = 2,102,400 rows per year.
But there could be even more sensors!
I thought about some kind of web service backend, but the device may not always have a connection to the internet/server.
The data must be displayable on the device itself.
How can I speed things up? Choose another table layout? Use another database (SQLite)? At the moment I use .NET CF 2.0 and SQL CE 3.5.
Any advice?
I'm sure any relational database would suit your needs. SQL Server, Oracle, etc. The important thing is to create good indexes so that your queries are efficient. If you have to do a table scan just to find a single record, it will be slow no matter which database you use.
If you always find yourself querying for a specific DataStreamID and Timestamp value, create an index for it. That way it will do an index seek instead of a scan.
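As a minimal sketch of that advice, using the column names from the question (the index name and the literal values are made up):

CREATE INDEX IX_Datapoint_Stream_Time ON Datapoint (DatastreamID, [Timestamp]);
SELECT [Timestamp], Value FROM Datapoint WHERE DatastreamID = 3 AND [Timestamp] BETWEEN '2012-01-01' AND '2012-01-31';
-- with the composite index this becomes a seek over one stream's date range instead of a full table scan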
The key to quick access is using one or more indexes.
A database of two million rows per year is very manageable.
Adding indexes will slow the INSERTs somewhat, but your data isn't coming in all that quickly, so it should not be an issue. If the data were coming in faster you might have to be more careful, but it would have to be far more data at a far faster rate than you have now to be a concern.
Do you have access to SQL Server, or even MySQL?
Your design must have these:
A primary key on the table; an integer PK is faster (see the sketch below).
You need to analyze your SELECT queries to see what is going on behind the scenes.
A SELECT must do a SEEK instead of a scan.
If 100K rows already make it slow, you must look at the query in the analyzer.
It might get a little slow with 100M rows, not 100K rows.
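As a hedged sketch of that advice, the Datapoint table could carry its own integer key; everything beyond the column names in the question is made up:

CREATE TABLE Datapoint (
  Id int IDENTITY(1,1) PRIMARY KEY,
  DatastreamID int NOT NULL,
  [Timestamp] datetime NOT NULL,
  Value float NOT NULL
);
-- a composite index on (DatastreamID, [Timestamp]), as in the earlier answer, still handles the range queries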
Hope this helps
Can you use SQL Server Express Edition instead? You can create indexes on it just like in the full version. I've worked with databases of over 100 million rows in SQL Server just fine. SQL Server Express Edition limits your database size to 10 GB, so as long as that's okay, the free edition should work for you.
http://www.microsoft.com/express/Database/
Is there any limit to the maximum number of rows in a table in a DBMS (specifically MySQL)?
I want to create a table for saving a log file, and its row count increases very fast. I want to know what I should do to prevent any problems.
I don't think there is an official limit; it will depend on maximum index sizes and filesystem restrictions.
From the MySQL 5.0 Features page:
Support for large databases. We use MySQL Server with databases that contain 50 million records. We also know of users who use MySQL Server with 200,000 tables and about 5,000,000,000 rows.
You should periodically move log rows out to a historical database for data mining and purge them from the transactional database. It's a common practice.
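A minimal sketch of that kind of rotation, assuming a log table with a created_at column and an archive table with the same structure (both names are made up):

INSERT INTO log_archive SELECT * FROM log WHERE created_at < NOW() - INTERVAL 1 MONTH;
DELETE FROM log WHERE created_at < NOW() - INTERVAL 1 MONTH;
-- run this on a schedule so the transactional table stays small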
There's probably some sort of limit, depending on the engine used and the table structure. I've got a table with approximately 45 million entries in a database I administer, and I've heard of (much) higher numbers.
I have a table of about 800,000 records. It's basically a log which I query often.
I added a condition to query only the entries that were made in the last month, in an attempt to reduce the load on the database.
My thinking is:
a) if the database only goes through that last month of entries and then returns them, that's good.
b) if the database goes through the whole table, checking the condition against every single record, it's actually worse than having no condition at all.
What is your opinion?
How would you go about reducing the load on a database?
If the field containing the entry date is keyed/indexed, and is used by the DB software to optimize the query, that should reduce the set of rows examined to the rows matching that date range.
That said, it's commonly understood that you are better off optimizing queries, then indexes, then database server settings and hardware, in that order. Changing how you query your data can reduce the impact of a badly formulated query a millionfold, depending on the dataset.
If there are no obvious areas for speedup in how the query itself is formulated (joins done correctly or not needed at all, effective use of indexes), adding indexes to support your common queries would be a good next step.
If you want more information about how the database is going to execute your query, you can use the MySQL EXPLAIN command to find out. For example, it will tell you whether it is able to use an index for the query.
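A minimal sketch, assuming the log table is called log and the date column is entry_date (both names are placeholders, as is the date literal):

CREATE INDEX idx_log_entry_date ON log (entry_date);
EXPLAIN SELECT * FROM log WHERE entry_date >= '2012-06-01';
-- a 'range' access type using the new index means only the recent rows are examined, not all 800,000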