This might sound like a simple question, but it is one that I haven't been able to find a satisfying answer to.
In MySQL, even on a Unix-based machine, column names are case-insensitive when running queries (table names there are case-sensitive by default, governed by lower_case_table_names), regardless of what case you use in the names.
Now, here is the tricky part. Let's assume the following table structure.
> desc table1
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| name  | varchar(10) | YES  |     | NULL    |       |
| id    | int(11)     | NO   | PRI | NULL    |       |
+-------+-------------+------+-----+---------+-------+
> desc table2
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| name1 | varchar(10) | YES  |     | NULL    |       |
| id    | int(11)     | NO   | PRI | NULL    |       |
+-------+-------------+------+-----+---------+-------+
> desc table_view
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id    | int(11)     | NO   |     | NULL    |       |
| name  | varchar(10) | YES  |     | NULL    |       |
| name1 | varchar(10) | YES  |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+
Now, if I run a query on my table with the column name written in upper case, the result header shows the column name in upper case, and in lower case if the query uses the lower-case column name.
> select name from table1
+-------+
| name  |
+-------+
| test  |
| test2 |
+-------+
> select NAME from table1
+-------+
| NAME  |
+-------+
| test  |
| test2 |
+-------+
However, when I run the same queries on the view, the column names match the case that was used when the view was created, i.e. the result set carries the column names as defined in the view, not as written in the query.
> select name from table_view
+-------+
| name  |
+-------+
| test  |
| test2 |
+-------+
> select NAME from table_view
+-------+
| name  |
+-------+
| test  |
| test2 |
+-------+
Is there an easy way to fix this problem and make sure that views follow the same standard or format as tables do when returning query results?
PS: We have some legacy code where we run into the table row-size limit when creating new tables, due to the number and size of the columns. So we split them dynamically into n tables and joined them into a view with the same name as the previous table, so that we didn't have to change much code.
But the case issue with the view is breaking things. Any help will be greatly appreciated.
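For what it's worth, an explicit alias per query does force the header case, since a view's column names are fixed at view-creation time (table and column names here are from the example above):

> select name as NAME from table_view

But rewriting every legacy query that way is exactly the kind of code change we're trying to avoid.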
I am still new to MySQL commands, and I have learned how to get the max value of a column and the count of a column. My database is named db and my table (named "foo") looks like this:
+----------+-------------+------+-----+---------+-------+
| Field    | Type        | Null | Key | Default | Extra |
+----------+-------------+------+-----+---------+-------+
| id       | varchar(64) | NO   | PRI | NULL    |       |
| expires  | datetime    | YES  | MUL | NULL    |       |
| extra    | text        | YES  |     | NULL    |       |
| valid    | tinyint(1)  | NO   |     | NULL    |       |
| trust_id | varchar(64) | YES  | MUL | NULL    |       |
| user_id  | varchar(64) | YES  | MUL | NULL    |       |
+----------+-------------+------+-----+---------+-------+
Basically, I want to know if there is a way to get the max date of the expires field and then print out the corresponding fields, like user_id and id. So far this is the command I use to get the max date from expires on the command line:
mysql -u root db -e "select max(expires) from foo;"
This alone will give me:
+---------------------+
| max(expires)        |
+---------------------+
| 2016-08-05 17:54:44 |
+---------------------+
What I would like to do is get this back:
+---------------------+---------+---------+
| max(expires)        | id      | user_id |
+---------------------+---------+---------+
| 2016-08-05 17:54:44 | cd97eb4 | 2bf2cec |
+---------------------+---------+---------+
So then I would have the max expiration date along with the id and user id for that expiration date.
Any insight is greatly appreciated!
Use order by and limit:
select f.*
from foo f
order by f.expires desc
limit 1;
I think this is the simplest way. And with an index on foo(expires), it should have the best performance as well.
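Run from the shell the same way as in the question, it would look something like this (assuming the same db and foo names):

mysql -u root db -e "select expires, id, user_id from foo order by expires desc limit 1;"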
If you just want the id and user_id corresponding to that (max) date, then something like this should do:
select expires, id, user_id
from foo
where expires in
      (select max(expires)
       from foo);
I ran a somewhat nonsense query on MySQL, but because its output is the same each time, I'm wondering if someone can help me understand the underlying algorithm.
Here's the table Orders on which we'll execute the query (database taken from here, just in case someone's interested):
+----------------+-------------+------+-----+---------+-------+
| Field          | Type        | Null | Key | Default | Extra |
+----------------+-------------+------+-----+---------+-------+
| orderNumber    | int(11)     | NO   | PRI | NULL    |       |
| orderDate      | date        | NO   |     | NULL    |       |
| requiredDate   | date        | NO   |     | NULL    |       |
| shippedDate    | date        | YES  |     | NULL    |       |
| status         | varchar(15) | NO   |     | NULL    |       |
| comments       | text        | YES  |     | NULL    |       |
| customerNumber | int(11)     | NO   | MUL | NULL    |       |
+----------------+-------------+------+-----+---------+-------+
There are 326 records for now, with the largest orderNumber being 10425.
Now here's the query I ran (basically removed GROUP BY from a sensible query):
mysql> select count(1), orderNumber, status from orders;
+----------+-------------+---------+
| count(1) | orderNumber | status  |
+----------+-------------+---------+
|      326 |       10100 | Shipped |
+----------+-------------+---------+
1 row in set (0.00 sec)
So I'm asking for the total number of rows, along with status and orderNumber, which can be just about anything under the given circumstances. But the query always returns orderNumber 10100, even if I log out and run it again.
Is there a predictable answer for this?
There's no predictable answer, so this is not something you should rely on in your design. In general, the DB will return the values from the first row that matches the query. If you want predictability, you should apply an aggregate to every column (e.g. using MIN or MAX to always get the smallest/largest value).
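As a quick sketch of that advice (column names are taken from the question; the choice of aggregates is just an example):

select count(1), min(orderNumber), max(orderNumber), min(status) from orders;

Note that newer MySQL versions with the ONLY_FULL_GROUP_BY SQL mode enabled will reject the original query with an error instead of silently picking an arbitrary row.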
I am new to MySQL, and I am trying to use it on a project that basically tracks players' performance.
Below are the table's fields:
+-------------------+----------------------+-------------------+------+-----+---------+----------------+---------------------------------+---------+
| Field             | Type                 | Collation         | Null | Key | Default | Extra          | Privileges                      | Comment |
+-------------------+----------------------+-------------------+------+-----+---------+----------------+---------------------------------+---------+
| unique_id         | int(11)              | NULL              | NO   | PRI | NULL    | auto_increment | select,insert,update,references |         |
| record_time       | datetime             | NULL              | NO   |     | NULL    |                | select,insert,update,references |         |
| game_sourceid     | char(20)             | latin1_swedish_ci | NO   | MUL | NULL    |                | select,insert,update,references |         |
| game_number       | smallint(6)          | NULL              | NO   |     | NULL    |                | select,insert,update,references |         |
| game_difficulty   | char(12)             | latin1_swedish_ci | NO   | MUL | NULL    |                | select,insert,update,references |         |
| cost_time         | smallint(5) unsigned | NULL              | NO   | MUL | NULL    |                | select,insert,update,references |         |
| country           | char(3)              | latin1_swedish_ci | NO   |     | NULL    |                | select,insert,update,references |         |
| source            | char(7)              | latin1_swedish_ci | NO   |     | NULL    |                | select,insert,update,references |         |
+-------------------+----------------------+-------------------+------+-----+---------+----------------+---------------------------------+---------+
I have added game_sourceid and game_difficulty as indexes, and the engine is InnoDB.
I have inserted about 11M rows of test data into this table, generated randomly but resembling the real data.
The most common query looks like this; it gets the average time and best time for a specific game_sourceid:
SELECT avg(cost_time) AS avgtime
, min(cost_time) AS mintime
, count(*) AS count
FROM statistics_work_table
WHERE game_sourceid = 'standard_easy_1';
+-----------+---------+--------+
| avgtime   | mintime | count  |
+-----------+---------+--------+
| 1681.2851 |     420 | 138034 |
+-----------+---------+--------+
1 row in set (4.97 sec)
The query took about 5 s.
I googled this, and someone said it may be caused by the number of rows involved, so I tried to narrow the scope like this:
SELECT avg(cost_time) AS avgtime
, min(cost_time) AS mintime
, count(*) AS count
FROM statistics_work_table
WHERE game_sourceid = 'standard_easy_1'
AND record_time > '2015-11-19 04:40:00';
+-----------+---------+-------+
| avgtime   | mintime | count |
+-----------+---------+-------+
| 1275.2222 |     214 |     9 |
+-----------+---------+-------+
1 row in set (4.46 sec)
As you can see, the query that matched only 9 rows also took about 5 s, so I don't think the problem is the number of rows counted.
The test data was generated randomly to simulate real user activity, so it is discontinuous. I therefore added about 250K more continuous rows with the same game_sourceid = 'standard_easy_9', keeping everything else random; in other words, the last 250K rows in this table share the same game_sourceid. Then I queried like this:
SELECT avg(cost_time) AS avgtime
, min(cost_time) AS mintime
, count(*) AS count
FROM statistics_work_table
WHERE game_sourceid = 'standard_easy_9';
+-----------+---------+--------+
| avgtime   | mintime | count  |
+-----------+---------+--------+
| 1271.4806 |      70 | 259379 |
+-----------+---------+--------+
1 row in set (0.40 sec)
This time the query magically took only 0.4 s, which is totally beyond my expectations.
So here's the question: the data is collected from players in real time, so it is necessarily random and discontinuous.
I am thinking of splitting the data into multiple tables by game_sourceid, but that would take another 80 tables, maybe more in the future.
Since I am new to MySQL, I am wondering if there are other solutions, or whether my query is just bad.
Update: here are the indexes on my table:
mysql> show index from statistics_work_table;
+-----------------------+------------+-------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table                 | Non_unique | Key_name                | Seq_in_index | Column_name     | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------------+------------+-------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| statistics_work_table |          0 | PRIMARY                 |            1 | unique_id       | A         |    11362113 |     NULL | NULL   |      | BTREE      |         |               |
| statistics_work_table |          1 | GameSourceId_CostTime   |            1 | game_sourceid   | A         |          18 |     NULL | NULL   |      | BTREE      |         |               |
| statistics_work_table |          1 | GameSourceId_CostTime   |            2 | cost_time       | A         |      344306 |     NULL | NULL   |      | BTREE      |         |               |
+-----------------------+------------+-------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
ALTER TABLE `statistics_work_table`
  ADD INDEX `GameSourceId_CostTime` (`game_sourceid`, `cost_time`);
This index should make your queries super fast. Also, after you run the above statement, you should drop the single-column index you have on game_sourceid, as the composite index above makes it redundant (and the extra index would hurt insert speed).
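For example, assuming the old single-column index is literally named game_sourceid (a guess; check SHOW INDEX FROM statistics_work_table for the actual Key_name before dropping):

ALTER TABLE `statistics_work_table` DROP INDEX `game_sourceid`;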
The reason your queries are slow is that the database uses your single-column index on game_sourceid to find the rows and then, for each row, follows the primary key stored alongside the index entry into the clustered index (the primary key, in this and most cases) to look up the cost_time value. This is referred to as a double lookup, and it is something you want to avoid.
The index I provided above is called a "covering index". It allows your query to use ONLY the index, and so you only need a single lookup per row, greatly improving performance.
I am designing a DB for a possible PHP MySQL project I may be undertaking.
I am a complete novice at relational DB design, and have only worked with single table DB's before.
This is a diagram of the tables:
So, 'Cars' contains each model of car, and the other 3 tables contain parts that the car can be fitted with.
So each car can have different parts from each of the three tables, and each part can be fitted to different cars. In reality, there will be about 10 of these parts tables.
So, what would be the best way to link these together? Do I need another table in the middle, etc.?
And what would I need to do with keys in terms of linking?
There is some inheritance in your parts. The common attributes seem to be:
part_number
price
and there are some specifics for your part types: exhaust, software, and intake.
There are two strategies:
- have three tables and one view over the three tables
- have one table with a parttype column, and maybe three views for the tables.
If you'd like to play with your design, you might want to look at my company's website http://www.uml2php.com. UML2PHP will automatically convert your UML design to a database design and let you "play" with the result.
At:
http://service.bitplan.com/uml2phpexamples/carparts/
you'll find an example application based on your design. The menu does not give access to all the tables yet.
via:
http://service.bitplan.com/uml2phpexamples/carparts/index.php?function=dbCheck
the table definitions are accessible:
mysql> describe CP01_car;
+-------------+---------------+------+-----+---------+-------+
| Field       | Type          | Null | Key | Default | Extra |
+-------------+---------------+------+-----+---------+-------+
| oid         | varchar(32)   | NO   |     | NULL    |       |
| car_id      | varchar(255)  | NO   | PRI | NULL    |       |
| model       | varchar(255)  | YES  |     | NULL    |       |
| description | text          | YES  |     | NULL    |       |
| model_year  | decimal(10,0) | YES  |     | NULL    |       |
+-------------+---------------+------+-----+---------+-------+
mysql> describe CP01_part;
+-------------+--------------+------+-----+---------+-------+
| Field       | Type         | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+-------+
| oid         | varchar(32)  | NO   |     | NULL    |       |
| part_number | varchar(255) | NO   | PRI | NULL    |       |
| price       | varchar(255) | YES  |     | NULL    |       |
| car_car_id  | varchar(255) | YES  |     | NULL    |       |
+-------------+--------------+------+-----+---------+-------+
mysql> describe cp01_exhaust;
+-------------+--------------+------+-----+---------+-------+
| Field       | Type         | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+-------+
| oid         | varchar(32)  | NO   |     | NULL    |       |
| type        | varchar(255) | YES  |     | NULL    |       |
| part_number | varchar(255) | NO   | PRI | NULL    |       |
| price       | varchar(255) | YES  |     | NULL    |       |
+-------------+--------------+------+-----+---------+-------+
mysql> describe CP01_intake;
+-------------+--------------+------+-----+---------+-------+
| Field       | Type         | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+-------+
| oid         | varchar(32)  | NO   |     | NULL    |       |
| part_number | varchar(255) | NO   | PRI | NULL    |       |
| price       | varchar(255) | YES  |     | NULL    |       |
+-------------+--------------+------+-----+---------+-------+
mysql> describe CP01_software;
+-------------+---------------+------+-----+---------+-------+
| Field       | Type          | Null | Key | Default | Extra |
+-------------+---------------+------+-----+---------+-------+
| oid         | varchar(32)   | NO   |     | NULL    |       |
| power_gain  | decimal(10,0) | YES  |     | NULL    |       |
| part_number | varchar(255)  | NO   | PRI | NULL    |       |
| price       | varchar(255)  | YES  |     | NULL    |       |
+-------------+---------------+------+-----+---------+-------+
The above tables have been generated from the UML model, and the result does not fit your needs yet, especially if you think of having 10 or more tables like this. The field car_car_id that links your parts to the car table should be available in all the part tables. And according to the design proposal, the base "table" for the parts should be a view like this:
mysql> create view partview as
  select oid, part_number, price from CP01_software
  union select oid, part_number, price from CP01_exhaust
  union select oid, part_number, price from CP01_intake;
Of course, the car_car_id column also needs to be selected.
Now you can edit every table by itself, and the partview will show all parts together.
To be able to distinguish the part types, you might want to add another column "part_type".
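A possible sketch of that extra column (illustrative only; it tags each branch of the union with a string literal, reusing the view definition above):

create or replace view partview as
  select oid, part_number, price, 'software' as part_type from CP01_software
  union all select oid, part_number, price, 'exhaust' from CP01_exhaust
  union all select oid, part_number, price, 'intake' from CP01_intake;

UNION ALL fits here because the literal tags already make the branches distinct.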
I would do it like this. Instead of having three different tables for car parts:
table - cars
table - parts (this would have only an id and a part number, and maybe a type)
table - part_connections (connecting cars with parts)
table - part_options (with all the options which aren't in the parts table, like "power gain")
table - part_option_connections (which connects the parts to the various part options)
In this way it is much easier to add new parts (because you won't need a new table), and it's closer to being normalized as well; a rough DDL sketch follows below.
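Here is a minimal DDL sketch of that layout; all column names and types are illustrative assumptions, not a definitive schema:

-- Each car model.
CREATE TABLE cars (
  car_id INT AUTO_INCREMENT PRIMARY KEY,
  model  VARCHAR(255)
);

-- One row per part, with a type tag instead of a table per type.
CREATE TABLE parts (
  part_id     INT AUTO_INCREMENT PRIMARY KEY,
  part_number VARCHAR(255),
  part_type   VARCHAR(50)      -- e.g. 'exhaust', 'intake', 'software'
);

-- Many-to-many link between cars and parts.
CREATE TABLE part_connections (
  car_id  INT,
  part_id INT,
  PRIMARY KEY (car_id, part_id),
  FOREIGN KEY (car_id)  REFERENCES cars (car_id),
  FOREIGN KEY (part_id) REFERENCES parts (part_id)
);

-- Options that not every part has, such as "power gain".
CREATE TABLE part_options (
  option_id   INT AUTO_INCREMENT PRIMARY KEY,
  option_name VARCHAR(255)
);

-- Link parts to their options, with the option's value.
CREATE TABLE part_option_connections (
  part_id      INT,
  option_id    INT,
  option_value VARCHAR(255),
  PRIMARY KEY (part_id, option_id),
  FOREIGN KEY (part_id)   REFERENCES parts (part_id),
  FOREIGN KEY (option_id) REFERENCES part_options (option_id)
);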
I have a question about how to analyze a query to know whether its performance is good or bad.
I searched a lot and got something like below:
SELECT count(*) FROM users; => Many experts said it's bad.
SELECT count(id) FROM users; => Many experts said it's good.
Please see the table:
+---------------+-------------+------+-----+---------+----------------+
| Field         | Type        | Null | Key | Default | Extra          |
+---------------+-------------+------+-----+---------+----------------+
| userId        | int(11)     | NO   | PRI | NULL    | auto_increment |
| f_name        | varchar(50) | YES  |     | NULL    |                |
| l_name        | varchar(50) | YES  |     | NULL    |                |
| user_name     | varchar(50) | NO   |     | NULL    |                |
| password      | varchar(50) | YES  |     | NULL    |                |
| email         | varchar(50) | YES  |     | NULL    |                |
| active        | char(1)     | NO   |     | Y       |                |
| groupId       | smallint(4) | YES  | MUL | NULL    |                |
| created_date  | datetime    | YES  |     | NULL    |                |
| modified_date | datetime    | YES  |     | NULL    |                |
+---------------+-------------+------+-----+---------+----------------+
But when I try to use the EXPLAIN command on these, I get the following results:
EXPLAIN SELECT count(*) FROM `user`;
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | user  | index | NULL          | groupId | 3       | NULL |   83 | Using index |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)
EXPLAIN SELECT count(userId) FROM user;
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | user  | index | NULL          | groupId | 3       | NULL |   83 | Using index |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)
So, the first question for me:
Can I conclude the performance is the same?
PS: the MySQL version is 5.5.8.
No, you cannot. EXPLAIN doesn't reflect all the work done by MySQL; it just gives you the plan for how the query will be executed.
As for count(*) vs count(id) specifically: the first is never slower than the second, and in some cases it is faster.
The semantics of count(col) is the number of non-NULL values of col, while count(*) is the number of rows.
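A quick illustration of that difference, using the nullable email column from the table above (names as shown in the question):

SELECT COUNT(*) AS total_rows, COUNT(email) AS non_null_emails FROM user;

The two results differ by exactly the number of rows where email IS NULL.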
MySQL can probably optimize count(col) by rewriting it into count(*) when col is a PK and thus cannot be NULL (otherwise it has to check for NULLs, which is not fast), but I still suggest you use COUNT(*) in such cases.
Also, the internal processing depends on the storage engine used. For MyISAM, a precalculated row count is returned in both cases (as long as you don't use WHERE).
In the example you give, the performance is identical.
The execution plan shows you that the optimiser is clever enough to use an index (here the groupId index) to count the total number of records when you use count(*).
There is no significant difference when it comes to counting; most optimizers will figure out the best way to count rows by themselves.
The performance difference shows up when searching for values on columns that lack indexes. If you search on a field with no index assigned ({f_name, l_name}) versus a field that has one ({userId}, since MySQL automatically indexes primary keys, or {groupId}, which looks like a foreign key), then you will see the difference in performance.
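To see this in practice, one could compare plans for an indexed versus an unindexed predicate (a sketch based on the columns shown above: groupId is indexed, f_name is not; the literal values are arbitrary):

EXPLAIN SELECT count(*) FROM user WHERE groupId = 1;
EXPLAIN SELECT count(*) FROM user WHERE f_name = 'John';

The first can be answered from the groupId index; the second forces a scan over the rows.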