Suggested way to resolve column name/type in a view - mysql

I have the following problem that I'm trying to find the best solution for. Let's say I have a view such as the following:
CREATE VIEW myView AS (
SELECT
country_code,
other_column,
COUNT(1) as cnt
FROM mytable
JOIN otherDatabase.otherTable USING (id)
GROUP BY 1,2 ORDER BY 1 LIMIT 1
)
What would be the fastest way to resolve the field names and types of the view? For example, on the above I am looking to get something along the lines of:
{
country_code: VARCHAR,
other_column: BOOL,
cnt: INT
}
The first approach is to run the query (with a limit, if necessary) and read the types of the result set from the driver. The downside is that the query itself may be slow: what if it takes 50 minutes to run?
The second approach I thought of is to 'follow' the columns to get their types and then do some parsing to resolve any expressions/literals/etc. This would involve a lot of code but would be orders of magnitude faster than the above. However, the potential downside of this is we may have access to the view but not have access to a table (possibly in another database on the server) that contains the column type, so it's possible we might not be able to resolve all field names.
What would be the best way to resolve the types of a view? Note I have tagged this as MySQL, but I'm also wondering if there's a more generic way to resolve types or if it's something that is non-standard and more needs to be done on a per-database basis?
Update: I believe the correct answer is just to run a DESCRIBE myView, and that would give me the column names and types without running the query?

In the current version of MySQL at least, INFORMATION_SCHEMA.COLUMNS holds metadata for views as well as base tables:
mysql> create table mytable (id serial primary key, x int);
Query OK, 0 rows affected (0.01 sec)
mysql> create view v as select * from mytable;
mysql> select column_name, data_type from information_schema.columns where table_name='v';
+-------------+-----------+
| COLUMN_NAME | DATA_TYPE |
+-------------+-----------+
| id          | bigint    |
| x           | int       |
+-------------+-----------+
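The lookup above can be wrapped in a small helper so that application code gets a name-to-type mapping. This is only a minimal sketch: it assumes a DB-API style cursor from a MySQL driver that uses %s placeholders, and the names view_schema, FakeCursor and 'mydatabase' are hypothetical (the fake cursor exists only so the example runs without a server). Filtering on table_schema as well avoids picking up a same-named view in another database.

```python
# Sketch: turn INFORMATION_SCHEMA.COLUMNS rows into a {column: type} mapping.
# `view_schema`, `FakeCursor` and 'mydatabase' are hypothetical names.

QUERY = (
    "SELECT column_name, data_type "
    "FROM information_schema.columns "
    "WHERE table_schema = %s AND table_name = %s "
    "ORDER BY ordinal_position"
)


def view_schema(cursor, schema, view):
    """Return {column_name: data_type} without executing the view's query."""
    cursor.execute(QUERY, (schema, view))
    return {name: data_type for name, data_type in cursor.fetchall()}


class FakeCursor:
    """Stand-in for a real driver cursor so the sketch runs offline."""

    def __init__(self, rows):
        self._rows = rows
        self.executed = None

    def execute(self, sql, params):
        # Record what would have been sent to the server.
        self.executed = (sql, params)

    def fetchall(self):
        return list(self._rows)


cursor = FakeCursor([("id", "bigint"), ("x", "int")])
schema = view_schema(cursor, "mydatabase", "v")
print(schema)
```

With a real connection you would pass the driver's own cursor instead of FakeCursor.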

A related issue...
SHOW CREATE TABLE myView;
or
SHOW CREATE VIEW myView;
will fully qualify all the columns.
(When writing a JOIN, it is wise to always qualify the column names.)

What is the default select order in PostgreSQL or MySQL?

I have read in the PostgreSQL docs that without an ORDER BY clause, SELECT will return records in an unspecified order.
Recently in an interview, I was asked how to SELECT records in the order that they were inserted, without a PK, created_at, or other field that can be used for ordering. The senior dev who interviewed me was insistent that without an ORDER BY clause the records will be returned in the order that they were inserted.
Is this true for PostgreSQL? Is it true for MySQL? Or any other RDBMS?
I can answer for MySQL. I don't know for PostgreSQL.
The default order is not the order of insertion, generally.
In the case of InnoDB, the default order depends on the order of the index read for the query. You can get this information from the EXPLAIN plan.
For MyISAM, it returns rows in the order they are read from the table. This might be the order of insertion, but MyISAM will reuse gaps after you delete records, so newer rows may be stored earlier.
None of this is guaranteed; it's just a side effect of the current implementation. MySQL could change the implementation in the next version, making the default order of result sets different, without violating any documented behavior.
So if you need the results in a specific order, you should use ORDER BY on your queries.
Following BK's answer, and by way of example...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table(id INT NOT NULL) ENGINE = MYISAM;
INSERT INTO my_table VALUES (1),(9),(5),(8),(7),(3),(2),(6);
DELETE FROM my_table WHERE id = 8;
INSERT INTO my_table VALUES (4),(8);
SELECT * FROM my_table;
+----+
| id |
+----+
| 1 |
| 9 |
| 5 |
| 4 | -- is this what
| 7 |
| 3 |
| 2 |
| 6 |
| 8 | -- we expect?
+----+
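To see why the 4 lands where the 8 used to be, here is a toy model of a row file that fills the first free slot on insert. It is not MyISAM's actual storage code, just an assumption-laden sketch of the gap-reuse effect; the class name ToyHeap is made up.

```python
# Toy model of a row file that reuses the first free slot on insert.
# This is NOT MyISAM's real storage code; it only mimics gap reuse.

class ToyHeap:
    def __init__(self):
        self.slots = []  # None marks a gap left behind by a delete

    def insert(self, value):
        # Prefer filling an existing gap over appending to the end.
        for i, slot in enumerate(self.slots):
            if slot is None:
                self.slots[i] = value
                return
        self.slots.append(value)

    def delete(self, value):
        self.slots[self.slots.index(value)] = None

    def scan(self):
        # An unordered SELECT simply walks the slots front to back.
        return [v for v in self.slots if v is not None]


heap = ToyHeap()
for v in (1, 9, 5, 8, 7, 3, 2, 6):
    heap.insert(v)
heap.delete(8)
for v in (4, 8):
    heap.insert(v)
print(heap.scan())  # [1, 9, 5, 4, 7, 3, 2, 6, 8]
```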
In the case of PostgreSQL, that is quite wrong.
If there are no deletes or updates, rows will be stored in the table in the order you insert them. And even though a sequential scan will usually return the rows in that order, that is not guaranteed: the synchronized sequential scan feature of PostgreSQL can have a sequential scan "piggy back" on an already executing one, so that rows are read starting somewhere in the middle of the table.
However, this ordering of the rows breaks down completely if you update or delete even a single row: the old version of the row will become obsolete, and (in the case of an UPDATE) the new version can end up somewhere entirely different in the table. The space for the old row version is eventually reclaimed by autovacuum and can be reused for a newly inserted row.
Without an ORDER BY clause, the database is free to return rows in any order. There is no guarantee that rows will be returned in the order they were inserted.
With MySQL (InnoDB), we observe that rows are typically returned in the order of an index used in the execution plan, or in the order of the table's cluster key.
It is not difficult to craft an example...
CREATE TABLE foo
( id INT NOT NULL
, val VARCHAR(10) NOT NULL DEFAULT ''
, UNIQUE KEY (id,val)
) ENGINE=InnoDB;
INSERT INTO foo (id, val) VALUES (7,'seven') ;
INSERT INTO foo (id, val) VALUES (4,'four') ;
SELECT id, val FROM foo ;
MySQL is free to return rows in any order, but in this case, we would typically observe that MySQL will access rows through the InnoDB cluster key.
id val
---- -----
4 four
7 seven
It's not at all clear what point the interviewer was trying to make. If the interviewer is trying to sell the idea that, given a requirement to return rows from a table in the order they were inserted, a query without an ORDER BY clause is ever the right solution, I'm not buying it.
We can craft examples where rows are returned in the order they were inserted, but that is a byproduct of the implementation, not guaranteed behavior, and we should never rely on it to satisfy a specification.

What is the "Default order by" for a mysql Innodb query that omits the Order by clause?

I understand, and have found posts indicating, that it is not recommended to omit the ORDER BY clause in a SQL query when you are retrieving data from the DBMS.
Resources & Post consulted (will be updated):
SQL Server UNION - What is the default ORDER BY Behaviour
When no 'Order by' is specified, what order does a query choose for your record set?
https://dba.stackexchange.com/questions/6051/what-is-the-default-order-of-records-for-a-select-statement-in-mysql
Questions :
See logic of the question below if you want to know more.
My question is : under mysql with innoDB engine, does anyone know how the DBMS effectively gives us the results ?
I read that it is implementation dependent, ok, but is there a way to know it for my current implementation ?
Where is this defined exactly ?
Is it from MySQL, InnoDB , OS-Dependent ?
Isn't there some kind of list out there ?
Most importantly, if I omit the ORDER BY clause and get my result, I can't be sure that this code will still work with newer database versions, or that the DBMS will always give me the same result, can I?
Use case & Logic :
I'm currently writing a CRUD API, and I have a table in my DB that doesn't contain an "id" field (there is a PK though). When I show the results of that table without any search criteria, I don't really have a clue what I should use to order the results. I could use the PK or any field that is never null, but that ordering wouldn't be meaningful. Since my CRUD is supposed to work for any table and I don't want to solve this problem by adding an exception for this specific table, I was wondering whether I could simply omit the ORDER BY clause.
Final Note :
As I read other posts, examples and code samples, I feel like I may be going too far. I understand that it is common knowledge that omitting the ORDER BY clause in a query is bad practice and that there is no reliable default order, not to say that there is no order at all unless you specify it.
I'd just love to learn how this works internally, or at least where it's defined (DBMS / storage engine / OS-dependent / other / multiple criteria). I think it would also benefit other people to understand the inner mechanisms in place here.
Thanks for taking the time to read anyway ! Have a nice day.
Without a clear ORDER BY, current versions of InnoDB return rows in the order of the index it reads from. Which index varies, but it always reads from some index. Even reading from the "table" is really an index—it's the primary key index.
As in the comments above, there's no guarantee this will remain the same in the next version of InnoDB. You should treat it as coincidental behavior: it is not documented, and the makers of MySQL don't promise not to change it.
Even if their implementation doesn't change, reading in index order can cause some strange effects that you might not expect, and which won't give you query result sets that make sense to you.
For example, the default index is the clustered index, PRIMARY. It means index order is the same as the order of values in the primary key (not the order in which you insert them).
mysql> create table mytable ( id int primary key, name varchar(20));
mysql> insert into mytable values (3, 'Hermione'), (2, 'Ron'), (1, 'Harry');
mysql> select * from mytable;
+----+----------+
| id | name |
+----+----------+
| 1 | Harry |
| 2 | Ron |
| 3 | Hermione |
+----+----------+
But if your query uses another index to read the table, like if you only access column(s) of a secondary index, you'll get rows in that order:
mysql> alter table mytable add key (name);
mysql> select name from mytable;
+----------+
| name |
+----------+
| Harry |
| Hermione |
| Ron |
+----------+
This shows it's reading the table by using an index-scan of that secondary index on name:
mysql> explain select name from mytable;
+----+-------------+---------+-------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | mytable | index | NULL | name | 83 | NULL | 3 | Using index |
+----+-------------+---------+-------+---------------+------+---------+------+------+-------------+
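The same effect can be mimicked outside the database. The sketch below is only an analogy (sorted Python lists standing in for the clustered and secondary indexes), not how InnoDB is implemented:

```python
# Analogy only: sorted lists stand in for InnoDB's clustered and secondary indexes.

inserted = [(3, "Hermione"), (2, "Ron"), (1, "Harry")]  # insertion order

# "Clustered index": rows kept in primary key (id) order.
clustered = sorted(inserted, key=lambda row: row[0])

# "Secondary index on name": entries kept in name order.
by_name = sorted(inserted, key=lambda row: row[1])

names_via_primary = [name for _, name in clustered]
names_via_secondary = [name for _, name in by_name]

print(names_via_primary)    # ['Harry', 'Ron', 'Hermione']
print(names_via_secondary)  # ['Harry', 'Hermione', 'Ron']
```

Neither result matches the insertion order, and which one you get depends entirely on which index is read.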
In a more complex query, it can become very tricky to predict which index InnoDB will use for a given query. The choice can even change from day to day, as your data changes.
All this goes to show: You should just use ORDER BY if you care about the order of your query result set!
Bill's answer is good. But not complete.
If the query is a UNION, it will (I think) deliver first the results of the first SELECT (according to the rules), then the results of the second. Also, if the table is PARTITIONed, it is likely to do a similar thing.
GROUP BY may sort by the grouping expressions, thereby leading to a predictable order, or it may use a hashing technique, which scrambles the rows. I don't know how to predict which.
A derived table used to be an ordered list that propagates into the parent query's ordering. But recently, the ORDER BY is being thrown away in that subquery! (Unless there is a LIMIT.)
Bottom Line: If you care about the order, add an ORDER BY, even if it seems unnecessary based on this Q & A.
MyISAM, in contrast, starts with this premise: The default order is the order in the .MYD file. But DELETEs leave gaps, UPDATEs mess with the gaps, and INSERTs prefer to fill in gaps over appending to the file. So, the row order is rather unpredictable. ALTER TABLE x ORDER BY y temporarily sets the .MYD order; this 'feature' does not work for InnoDB.

MySQL - Copying partial data from one table to another

This may be a silly question, and I understand why I'm getting the result that I am. However, I thought MySQL acted differently, and I can't find the documentation to tell me otherwise.
I have 2 basic tables as follows:
CREATE TABLE test ( num INT, time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP );
CREATE TABLE test_to_copy ( num INT );
I then create a single entry into the test_to_copy table:
INSERT INTO test_to_copy VALUES ( 12 );
Now I try and copy the table test_to_copy to test like so:
INSERT INTO test SELECT * FROM test_to_copy;
The error that keeps getting thrown is
"Column count doesn't match value count at row 1".
I know it is complaining that the number of columns in the two tables does not match, so it does not know which column each copied value should go to. But shouldn't the time column be filled in automatically (i.e. defaulted) when nothing is inserted for it during the copy, rather than an error being thrown?
Due to constraints, I can no longer have the time column in both tables, and I must do a SELECT * on the test_to_copy table as there are over 50 columns. Is there an easy way around this?
This is another variation of a frequent question: "can I query *-except-for-one-column?"
No, there is no wildcard-with-exceptions syntax in SQL. The * wildcard means all columns. If you don't want all columns, you must name the columns explicitly.
If you have a variety of columns because this method may be used for more than one table, you can get the list of columns for any given table from INFORMATION_SCHEMA.COLUMNS and use that information to build a dynamic SQL query.
Here's a way you can produce the list of columns:
SELECT
GROUP_CONCAT(
CONCAT('`', column_name, '`')
) AS _cols
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA='mydatabase' AND TABLE_NAME='mytable'
AND COLUMN_NAME NOT IN ('time'); -- or other columns to exclude
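Once you have the column names, building the statement is plain string assembly. Here is a rough sketch in Python; build_copy_sql and quote_ident are hypothetical helper names, and the column list is hard-coded where it would normally come from the INFORMATION_SCHEMA.COLUMNS query above.

```python
# Sketch: build an INSERT ... SELECT that names its columns explicitly.
# `build_copy_sql` and `quote_ident` are hypothetical helpers.

def quote_ident(name):
    """Backtick-quote a MySQL identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_copy_sql(target, source, columns, exclude=()):
    """Copy every listed column except the excluded ones from source to target."""
    cols = [c for c in columns if c not in exclude]
    if not cols:
        raise ValueError("no columns left to copy")
    col_list = ", ".join(quote_ident(c) for c in cols)
    return (
        f"INSERT INTO {quote_ident(target)} ({col_list}) "
        f"SELECT {col_list} FROM {quote_ident(source)}"
    )


# Columns of the target table, minus the one that should take its default.
sql = build_copy_sql("test", "test_to_copy", ["num", "time"], exclude=("time",))
print(sql)
```

Because the excluded column is left out of the INSERT column list, MySQL applies its DEFAULT (CURRENT_TIMESTAMP in the question's table).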
See also:
Select all columns except one in MySQL?
SQL exclude a column using SELECT * [except columnA] FROM tableA?
INSERT INTO test (num)
SELECT num
FROM test_to_copy;

COUNT() and EXPLAIN row's count are getting different [duplicate]

mysql> select count(*) from table where relation_title='xxxxxxxxx';
+----------+
| count(*) |
+----------+
| 1291958 |
+----------+
mysql> explain select * from table where relation_title='xxxxxxxxx';
+----+-------------+---------+-
| id | select_type | rows |
+----+-------------+---------+-
| 1 | SIMPLE | 1274785 |
+----+-------------+---------+-
I think that "explain select * from table where relation_title='xxxxxxxxx';" returns the number of rows with relation_title='xxxxxxxxx' found via the index. But it's smaller than the true number.
It is showing how many rows it ran through to get your result.
The reason for the wrong data is that EXPLAIN is not exact: it makes guesses about your data based on statistics stored about your table.
This is very useful information, for example when doing JOINS on many tables and you want to be sure that you aren't running through the entire joined table for one row of information for each row you have.
Here's a test on a 608 row table.
SELECT COUNT(id) FROM table WHERE user_id = 1
Result:
COUNT(id)
512
And here's the explain
EXPLAIN SELECT COUNT(id) FROM table WHERE user_id = 1
Result:
id rows
1 608
The EXPLAIN query will use the value provided in the INFORMATION_SCHEMA table, which contains a rough estimate of the row count for innodb tables - see notes section in mysql docs on INFORMATION_SCHEMA.TABLES.
Execute ANALYZE TABLE table_name; to update the statistics that EXPLAIN uses, and you'll get more accurate numbers. For example, when a table contains no data at all, EXPLAIN assumes the table is empty, and the optimizer plans queries to filter on that table first (since it doesn't have to read anything from disk, memory and so on). If data is then loaded and you don't execute ANALYZE TABLE table_name;, the optimizer still assumes the table is empty and does not use an optimal execution plan for the query. EXPLAIN behaves the same way: it doesn't look at the current row count of the table; it looks at the statistics generated by ANALYZE TABLE table_name (which is executed automatically in some situations, for example when 1/16 of the rows in the table have changed).
The question of the OP is valid, but judging from the answers, I think there is a misunderstanding of what the rows column of explain is actually telling us.
The MySQL documentation for the rows column of EXPLAIN states:
The rows column indicates the number of rows MySQL believes it must examine to execute the query.
So what COUNT(*) tells you and what Explain rows tells you are two different things that MAY happen to result in the same number, but they are not the same information.
The first is a count of all the rows that match the query being run. The second, is an estimate of all the rows that need to be examined for the query to run.
So when I run
SELECT COUNT(id) FROM table WHERE user_id = 1
I get the number of rows where user_id = 1.
When I run
EXPLAIN SELECT COUNT(id) FROM table WHERE user_id = 1
the rows column contains an estimate of all the rows that MySQL needs to go through to give you that answer. That's the whole table in this case.
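The difference is easy to see with a toy scan. The sketch below uses made-up data shaped like the 608-row test above (512 rows with user_id = 1) and assumes no usable index on user_id, so every row has to be visited:

```python
# Toy illustration of "rows examined" versus "rows matched" for a full scan.
# The 608/512 split mirrors the test table above; the data itself is made up.

rows = [{"id": i, "user_id": 1 if i <= 512 else 2} for i in range(1, 609)]

examined = 0
matched = 0
for row in rows:  # no usable index assumed: every row is visited
    examined += 1
    if row["user_id"] == 1:
        matched += 1

print(matched)   # 512, the kind of number COUNT(id) reports
print(examined)  # 608, the kind of number EXPLAIN's rows column estimates
```

In a real server the second number comes from table statistics rather than an actual scan, which is why it can also be slightly off.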

Find column that contains a given value in MySQL

I have a table in a MySQL database. I am given a value that occurs as a cell value in that table, but I do not know which cell it is, i.e. the row and column of that cell. What is the most efficient way to find the column to which that value belongs? Thanks in advance.
Example:
Column_1 | Column_2 | Column_3
1 | 2 | 3
4 | 5 | 6
7 | 8 | 9
Now I am given an input value of "8". I want to know if there is an efficient way to find out that value of "8" belongs to Column_2.
It's a bit strange that you don't know which column the data is in, since columns are meant to have a well-defined function.
[Original response scrubbed.]
EDIT: Your updated post just asks for the column. In that case, you don't need the view, and can just run this query
SELECT col FROM (
SELECT 'Column_1' AS col, Column_1 AS value FROM YourTable
UNION ALL SELECT 'Column_2', Column_2 FROM YourTable
UNION ALL SELECT 'Column_3', Column_3 FROM YourTable
) allValues
WHERE value=8;
When you run this query against your table, it will return "Column_2"
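If the table has many columns, the same query can be generated rather than typed out. A minimal sketch, assuming the table and column names come from a trusted source (they are interpolated directly) and that the search value is passed separately through a %s placeholder; build_find_column_sql is a hypothetical helper name.

```python
# Sketch: generate the UNION ALL query above for any list of columns.
# `build_find_column_sql` is a hypothetical helper, not part of any library.
# Identifiers are interpolated as-is, so they must come from a trusted source.

def build_find_column_sql(table, columns):
    if not columns:
        raise ValueError("need at least one column")
    parts = []
    for i, col in enumerate(columns):
        # Only the first SELECT of a UNION needs the column aliases.
        name_alias = " AS col" if i == 0 else ""
        value_alias = " AS value" if i == 0 else ""
        parts.append(
            f"SELECT '{col}'{name_alias}, `{col}`{value_alias} FROM `{table}`"
        )
    union = "\n  UNION ALL ".join(parts)
    return f"SELECT col FROM (\n  {union}\n) allValues\nWHERE value = %s"


sql = build_find_column_sql("YourTable", ["Column_1", "Column_2", "Column_3"])
print(sql)
```

The resulting string would be executed with the value as a parameter, e.g. cursor.execute(sql, (8,)) with a driver that uses %s placeholders.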
Without knowing more about your app, you have several options:
Use MySQL's built-in full-text search. You can check the MATCH function in the MySQL documentation.
Depending on the needs of your app, you could decide to index your whole table with an external full-text search engine, like Solr or Sphinx. This provides very fast response times, but you'll need to keep the index updated.
You can loop through all the columns in the table doing a LIKE query in MySQL (very expensive in CPU and time)
You're designing this table with repeating groups, which violates First Normal Form.
You should create a second table and store the values for column1, column2, and column3 in a single column, on three rows.
Learn about the rules of database normalization for more details.