This is an excerpt from one table:
| id | type | other_id | def_id | ref_def_id|
| 1 | int | NULL | 5 | NULL |
| 2 | string | NULL | 5 | NULL |
| 3 | int | NULL | 5 | NULL |
| 20 | ref | 3 | NULL | 5 |
| 21 | ref | 4 | NULL | 5 |
| 22 | ref | 5 | NULL | 5 |
What I want is to find entries with type ref. Then I would for example have this one entry in my result:
| 22 | ref | 5 | NULL | 5 |
The problem I am facing is that I now want to combine this entry with other entries of the same table where def_id = 5.
So I would get all entries with def_id = 5 for this specific ref type as result. I somehow need the output from my first query, check what the ref_def_id is and then make another query for this id.
I really have problems to understand how to proceed. Any input is much appreciated.
If I understand correctly you need to find rows with a type of 'ref' and then use the values in their ref_def_id columns to get the rows with the same values in def_id. In that case you need to use a subquery for getting the rows with 'ref' type and combine it using either IN or EXISTS:
select *
from YourTable
where def_id in (select ref_def_id from YourTable where type='ref');
select *
from YourTable
where exists (select * from YourTable yt
where yt.ref_def_id=YourTable.def_id and yt.type='ref')
Both queries are equivalent. IN is easier to understand at first sight, but EXISTS allows more complex conditions (for example, you can match on more than one column against the subquery).
Edit: since you commented that you also need the id from the 'ref' rows, you need to join against a subquery:
select source_id, YourTable.*
from YourTable
join (select id as source_id, ref_def_id
from YourTable
where type='ref')
as refs on refs.ref_def_id=YourTable.def_id
order by source_id, id;
With this, for each 'ref' row you get all the rows whose def_id matches its ref_def_id.
Use the query below to select a column from a subquery:
select a.ref_def_id
from (select ref_def_id from YourTable where type='ref') as a;
What you are looking for is a subquery or even better a join operation.
Have a look here: http://www.mysqltutorial.org/mysql-left-join.aspx
Joins, such as the left join, let you combine rows of tables (or of one table with itself) within one query on a given condition. For your purpose the condition could be def_id = 5.
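As a rough sketch of the join approach applied to the question's data, a self-join can pair each 'ref' row with the rows it points to. The table name entries is an assumption, and only four of the sample rows are loaded; the column names are taken from the question.

```sql
-- Hypothetical table name; columns follow the excerpt in the question.
CREATE TABLE entries (
  id         INT PRIMARY KEY,
  type       VARCHAR(10),
  other_id   INT,
  def_id     INT,
  ref_def_id INT
);

INSERT INTO entries (id, type, other_id, def_id, ref_def_id) VALUES
  (1,  'int',    NULL, 5,    NULL),
  (2,  'string', NULL, 5,    NULL),
  (3,  'int',    NULL, 5,    NULL),
  (22, 'ref',    5,    NULL, 5);

-- r is a 'ref' row; d is every row whose def_id equals r.ref_def_id.
SELECT r.id AS ref_id, d.id, d.type, d.def_id
FROM entries AS r
JOIN entries AS d ON d.def_id = r.ref_def_id
WHERE r.type = 'ref'
ORDER BY r.id, d.id;
```

With these four rows the query pairs ref_id 22 with ids 1, 2 and 3. Switching to a LEFT JOIN would also keep 'ref' rows that have no matching definition, with NULLs in the d columns.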
You would seem to want aggregation:
select max(id) as id, type, max(other_id) as other_id,
max(def_id) as def_id, ref_def_id
from t
where type = 'ref'
group by type, ref_def_id
I have a record table and its comment table, like:
| commentId | relatedRecordId | isRead |
|-----------+-----------------+--------|
| 1 | 1 | TRUE |
| 2 | 1 | FALSE |
| 3 | 1 | FALSE |
Now I want to select newCommentCount and allCommentCount as a server response to the browser. Is there any way to select these two fields in one SQL statement?
I've tried this:
SELECT `isRead`, count(*) AS cnt FROM comment WHERE relatedRecordId=1 GROUP BY `isRead`
| isRead | cnt |
| FALSE | 2 |
| TRUE | 1 |
But then I have to map the result into a special data structure and sum the cnt fields of the two rows in an upper-layer programming language to get allCommentCount. I want to know whether I can get data in the following format using SQL only, in one step:
| newCommentCount | allCommentCount |
|-----------------+-----------------|
| 2 | 3 |
I don't even know how to describe the question, so I found nothing on Google or Stack Overflow (maybe because of my poor English).
Use conditional aggregation:
SELECT SUM(NOT isRead) AS newCommentCount, COUNT(*) AS allCommentCount
FROM comment
WHERE relatedRecordId = 1;
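In MySQL a boolean expression evaluates to 1 or 0, which is why SUM(NOT isRead) counts the unread rows. One detail worth guarding against: SUM returns NULL when no rows match, while COUNT(*) returns 0. A minimal sketch with COALESCE, assuming a schema like the one below (the column types are my guess from the question):

```sql
-- Assumed schema, based on the columns shown in the question.
CREATE TABLE comment (
  commentId       INT PRIMARY KEY,
  relatedRecordId INT NOT NULL,
  isRead          BOOLEAN NOT NULL
);

INSERT INTO comment (commentId, relatedRecordId, isRead) VALUES
  (1, 1, TRUE),
  (2, 1, FALSE),
  (3, 1, FALSE);

-- NOT isRead is 1 for unread rows and 0 for read rows.
-- COALESCE turns the NULL that SUM yields for an empty set into 0.
SELECT COALESCE(SUM(NOT isRead), 0) AS newCommentCount,
       COUNT(*)                     AS allCommentCount
FROM comment
WHERE relatedRecordId = 1;
```

For the three sample rows this gives newCommentCount = 2 and allCommentCount = 3.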
If I understand correctly, you want to show the count of new comments and the count of all comments, so you can do it like this:
SELECT SUM(CASE WHEN isRead = FALSE THEN 1 ELSE 0 END) AS newCommentCount,
COUNT(*) AS allCommentCount FROM comment WHERE relatedRecordId = 1
You can also create a stored procedure for it.
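A stored procedure for this could look roughly like the sketch below. The procedure name getCommentCounts and the parameter p_recordId are made up for illustration, and the table definition is only a guess at the question's schema.

```sql
-- Assumed schema; skipped if the table already exists.
CREATE TABLE IF NOT EXISTS comment (
  commentId       INT PRIMARY KEY,
  relatedRecordId INT NOT NULL,
  isRead          BOOLEAN NOT NULL
);

-- Hypothetical procedure name and parameter. The body is a single statement,
-- so no BEGIN ... END block or DELIMITER change is needed.
CREATE PROCEDURE getCommentCounts(IN p_recordId INT)
  SELECT SUM(CASE WHEN isRead = FALSE THEN 1 ELSE 0 END) AS newCommentCount,
         COUNT(*) AS allCommentCount
  FROM comment
  WHERE relatedRecordId = p_recordId;

-- Returns one row with both counts for record 1.
CALL getCommentCounts(1);
```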
To place two result sets side by side, you can simply use a subquery as an expression in the SELECT clause, as long as the result sets have the same number of rows (here, one each):
select (select count(*) from c_table where isread=false and relatedRecordId=1 ) as newCommentCount,
count(*) as allCommentCount
from c_table where relatedRecordId=1;
Consider a typical GROUP BY statement in SQL: you have a table like
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| B | 2 |
| A | 3 |
| B | 4 |
+------+-------+
And you ask for
SELECT Name, SUM(Value) as Value
FROM table
GROUP BY Name
You'll receive
+------+-------+
| Name | Value |
+------+-------+
| A | 4 |
| B | 6 |
+------+-------+
In your head, you can imagine that SQL generates an intermediate sorted table like
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| A | 3 |
| B | 2 |
| B | 4 |
+------+-------+
and then aggregates together successive rows: the "Value" column has been given an aggregator (in this case SUM), so it's easy to aggregate. The "Name" column has been given no aggregator, and thus uses what you might call the "trivial partial aggregator": given two things that are the same (e.g. A and A), it aggregates them into a single copy of one of the inputs (in this case A). Given any other input it doesn't know what to do and is forced to begin aggregating anew (this time with the "Name" column equal to B).
I want to do a more exotic kind of aggregation. My table looks like
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| BC | 2 |
| AY | 3 |
| AZ | 4 |
| B | 5 |
| BCR | 6 |
+------+-------+
And the intended output is
+------+-------+
| Name | Value |
+------+-------+
| A | 8 |
| B | 13 |
+------+-------+
Where does this come from? A and B are the "minimal prefixes" for this set of names: they occur in the data set and every Name has exactly one of them as a prefix. I want to aggregate data by grouping rows together when their Names have the same minimal prefix (and add the Values, of course).
In the toy grouping model from before, the intermediate sorted table would be
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| AY | 3 |
| AZ | 4 |
| B | 5 |
| BC | 2 |
| BCR | 6 |
+------+-------+
Instead of using the "trivial partial aggregator" for Names, we would use one that can aggregate X and Y together iff X is a prefix of Y; in that case it returns X. So the first three rows would be aggregated together into a row with (Name, Value) = (A, 8), then the aggregator would see that A and B couldn't be aggregated and would move on to a new "block" of rows to aggregate.
The tricky thing is that the value we're grouping by is "non-local": if A were not a name in the dataset, then AY and AZ would each be a minimal prefix. It turns out that the AY and AZ rows are aggregated into the same row in the final output, but you couldn't know that just by looking at them in isolation.
Miraculously, in my use case the minimal prefix of a string can be determined without reference to anything else in the dataset. (Imagine that each of my names is one of the strings "hello", "world", and "bar", followed by any number of z's. I want to group all of the Names with the same "base" word together.)
As I see it I have two options:
1) The simple option: compute the prefix for each row and group by that value directly. Unfortunately I have an index on the Name, and computing the minimal prefix (whose length depends on the Name itself) prevents me from using that index. This forces a full table scan, which is prohibitively slow.
2) The complicated option: somehow convince MySQL to use the "partial prefix aggregator" for Name. This runs into the "non-locality" problem above, but that's fine as long as we scan the table according to my index on Name, since then every minimal prefix will be encountered before any of the other strings it is a prefix of; we would never try to aggregate AY and AZ together if A were in the dataset.
In an imperative programming language #2 would be rather easy: extract rows one at a time, in alphabetical order, keeping track of the current prefix. If your new row's Name has that as a prefix, it goes in the bucket you're currently using. Otherwise, start a new bucket with the new row's Name as your prefix. In MySQL I am lost as to how to do it. Note that the set of minimal prefixes is not known beforehand.
Edit 2
It occurred to me that if the table is ordered by Name, this would be a lot easier (and faster). Since I don't know if your data is sorted, I've included a sort in this query, but if the data is sorted, you can strip out (SELECT * FROM table1 ORDER BY Name) t1 and just use FROM table1
SELECT prefix, SUM(`Value`)
FROM (SELECT Name, Value, @prefix:=IF(Name NOT LIKE CONCAT(@prefix, '_%'), Name, @prefix) AS prefix
FROM (SELECT * FROM table1 ORDER BY Name) t1
JOIN (SELECT @prefix := '~') p
) t2
GROUP BY prefix
Updated SQLFiddle
Edit
Having slept on the problem, I realised that there is no need to do the IN, it's enough to just have a WHERE NOT EXISTS clause on the JOINed table:
SELECT t1.Name, SUM(t2.Value) AS `Value`
FROM table1 t1
JOIN table1 t2 ON t2.Name LIKE CONCAT(t1.Name, '%')
WHERE NOT EXISTS (SELECT *
FROM table1 t3
WHERE t1.Name LIKE CONCAT(t3.Name, '_%')
)
GROUP BY t1.Name
Updated Explain (Name changed to UNIQUE key from PRIMARY)
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY t1 index Name Name 11 NULL 6 Using where; Using index; Using temporary; Using filesort
1 PRIMARY t2 ALL NULL NULL NULL NULL 6 Using where; Using join buffer (Block Nested Loop)
3 DEPENDENT SUBQUERY t3 index NULL Name 11 NULL 6 Using where; Using index
Updated SQLFiddle
Original Answer
Here is one way you could do it. First, you need to find all the unique prefixes in your table. You can do that by looking for all values of Name where it does not look like another value of Name with other characters on the end. This can be done with this query:
SELECT Name
FROM table1 t1
WHERE NOT EXISTS (SELECT *
FROM table1 t2
WHERE t1.Name LIKE CONCAT(t2.Name, '_%')
)
For your sample data, that will give
Name
A
B
Now you can sum all the values where the Name starts with one of those prefixes. Note we change the LIKE pattern in this query so that it also matches the prefix, otherwise we wouldn't count the values for A and B in your example:
SELECT t1.Name, SUM(t2.Value) AS `Value`
FROM table1 t1
JOIN table1 t2 ON t2.Name LIKE CONCAT(t1.Name, '%')
WHERE t1.Name IN (SELECT Name
FROM table1 t3
WHERE NOT EXISTS (SELECT *
FROM table1 t4
WHERE t3.Name LIKE CONCAT(t4.Name, '_%')
)
)
GROUP BY t1.Name
Output:
Name Value
A 8
B 13
An EXPLAIN says that both of these queries use the index on Name, so should be reasonably efficient. Here is the result of the explain on my MySQL 5.6 server:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY t1 index PRIMARY PRIMARY 11 NULL 6 Using index; Using temporary; Using filesort
1 PRIMARY t3 eq_ref PRIMARY PRIMARY 11 test.t1.Name 1 Using where; Using index
1 PRIMARY t2 ALL NULL NULL NULL NULL 6 Using where; Using join buffer (Block Nested Loop)
3 DEPENDENT SUBQUERY t4 index NULL PRIMARY 11 NULL 6 Using where; Using index
SQLFiddle Demo
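If, as the question says, the minimal prefix can be computed from the Name alone, another thing that might be worth trying is storing that prefix in an indexed generated column. This is only a sketch and not something from the question: it assumes MySQL 5.7 or later and that the "base" word is the Name with trailing z's stripped (the hello/world/bar illustration). The column name base and the index name idx_base are made up.

```sql
-- Sketch only: assumes MySQL 5.7+ and the "base word plus trailing z's" rule.
CREATE TABLE table1 (
  `Name`  VARCHAR(32) NOT NULL,
  `Value` INT NOT NULL,
  -- base is computed by the server whenever Name is inserted or updated
  base    VARCHAR(32) GENERATED ALWAYS AS (TRIM(TRAILING 'z' FROM `Name`)) STORED,
  INDEX idx_base (base)
);

INSERT INTO table1 (`Name`, `Value`) VALUES
  ('hello', 1), ('helloz', 2), ('world', 3), ('worldzz', 4), ('bar', 5);

-- Grouping on the stored column gives the optimizer a chance to use idx_base.
SELECT base AS `Name`, SUM(`Value`) AS `Value`
FROM table1
GROUP BY base;
```

For this sample data the sums are hello = 3, world = 7 and bar = 5. Whether the index is actually used depends on the optimizer, so an EXPLAIN on the real table is the way to check.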
Here are some hints on how to do the task. This locates any prefixes that are useful. That's not what you asked for, but the flow of the query and the usage of @variables, plus the need for 2 (actually 3) levels of nesting, might help you.
SELECT DISTINCT `Prev`
FROM
(
SELECT @prev := @next AS 'Prev',
@next := IF(LEFT(city, LENGTH(@prev)) = @prev, @next, city) AS 'Next'
FROM ( SELECT @next := ' ' ) AS init
JOIN ( SELECT DISTINCT city FROM us ) AS dedup
ORDER BY city
) x
WHERE `Prev` = `Next` ;
Partial output:
+----------------+
| Prev |
+----------------+
| Alamo |
| Allen |
| Altamont |
| Ames |
| Amherst |
| Anderson |
| Arlington |
| Arroyo |
| Auburn |
| Austin |
| Avon |
| Baker |
Check the Al% cities:
mysql> SELECT DISTINCT city FROM us WHERE city LIKE 'Al%' ORDER BY city;
+-------------------+
| city |
+-------------------+
| Alabaster |
| Alameda |
| Alamo | <--
| Alamogordo | <--
| Alamosa |
| Albany |
| Albemarle |
...
| Alhambra |
| Alice |
| Aliquippa |
| Aliso Viejo |
| Allen | <--
| Allen Park | <--
| Allentown | <--
| Alliance |
| Allouez |
| Alma |
| Aloha |
| Alondra Park |
| Alpena |
| Alpharetta |
| Alpine |
| Alsip |
| Altadena |
| Altamont | <--
| Altamonte Springs | <--
| Alton |
| Altoona |
| Altus |
| Alvin |
+-------------------+
40 rows in set (0.01 sec)
I am trying to write a query that finds invalid refby values (refby refers to id). Please check the following table structure:
| id | acnumber | refby |
+----+-----------+--------+
| 1 | ac01 | 2 |
+----+-----------+--------+
| 2 | ac02 | 1 |
+----+-----------+--------+
| 3 | ac03 | 5 |
+----+-----------+--------+
As you can see, there is no id with a value of 5 in the table above, so the query must return the 3rd row as the result.
I have tried...
SELECT * FROM tbl.members WHERE refby != (SELECT id FROM tbl.members WHERE id = refby)
But this is not giving correct results, please help, thanks.
SELECT * FROM members WHERE refby not in (SELECT id FROM members)
This should solve your problem
You can try this using NOT IN:
SELECT * FROM tbl.members WHERE refby NOT IN (SELECT id FROM tbl.members)
This can also be written as a LEFT JOIN, since NOT IN can be slow on large tables. Assuming id is a PRIMARY KEY or UNIQUE key (that is, unique within your dataset), this query should return the same results as long as refby is never NULL.
SELECT
members1.*
FROM
members members1
LEFT JOIN
members members2
ON members1.refby = members2.id
WHERE members2.id IS NULL
Check the SQLFiddle: http://sqlfiddle.com/#!2/05731/1
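A third way to write the same check is NOT EXISTS. The sketch below assumes a minimal members table like the one in the question (the column types are guesses) and treats rows with a NULL refby as valid, which is my assumption rather than something the question states.

```sql
-- Hypothetical minimal schema matching the question's columns.
CREATE TABLE members (
  id       INT PRIMARY KEY,
  acnumber VARCHAR(10),
  refby    INT NULL
);

INSERT INTO members (id, acnumber, refby) VALUES
  (1, 'ac01', 2),
  (2, 'ac02', 1),
  (3, 'ac03', 5);

-- Keep each member whose refby value has no matching id.
-- The IS NOT NULL test skips members that have no referrer at all.
SELECT m.*
FROM members AS m
WHERE m.refby IS NOT NULL
  AND NOT EXISTS (SELECT 1 FROM members AS p WHERE p.id = m.refby);
```

For the sample data this returns only the row with id 3, because no member has id 5.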
I have a table like this:
+----+---------+---------+
| Id | column1 | column2 |
+----+---------+---------+
| 1 | a | b |
| 2 | a | b |
+----+---------+---------+
and a query like this: SELECT * FROM table WHERE id IN (1,2,3)
What query do I need to get a result like this? (I need NULL values for nonexistent ids.)
+----+---------+---------+
| Id | column1 | column2 |
+----+---------+---------+
| 1 | a | b |
| 2 | a | b |
| 3 | null | null |
+----+---------+---------+
EDIT
Thanks for the responses so far.
Is there a more 'dynamic' way to do this? The query above is just an example.
In reality I need to check around 1000 ids!
You could use something like this:
SELECT ids.ID, your_table.column1, your_table.column2
FROM
(SELECT 1 as ID
UNION ALL SELECT 2
UNION ALL SELECT 3) ids left join your_table
on ids.ID = your_table.ID
First subquery returns each value you need in a different row. Then you can try to join each row with your_table. If you use a left join, all values from the first subquery are shown, and if there's a match with your_table, values from your_table are shown, otherwise you will get nulls.
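For around 1000 ids, a long UNION ALL list gets unwieldy. One possible variation is to load the ids into a temporary table and left join from it. The table name wanted_ids is hypothetical, and your_table is recreated here only so that the sketch is self-contained.

```sql
-- Stand-in for the real table, with the two rows from the question.
CREATE TABLE your_table (
  ID      INT PRIMARY KEY,
  column1 VARCHAR(10),
  column2 VARCHAR(10)
);
INSERT INTO your_table (ID, column1, column2) VALUES (1, 'a', 'b'), (2, 'a', 'b');

-- Hypothetical helper table; the application would insert its ~1000 ids here.
CREATE TEMPORARY TABLE wanted_ids (ID INT PRIMARY KEY);
INSERT INTO wanted_ids (ID) VALUES (1), (2), (3);

-- Every wanted id appears once; ids missing from your_table get NULLs.
SELECT w.ID, t.column1, t.column2
FROM wanted_ids AS w
LEFT JOIN your_table AS t ON t.ID = w.ID
ORDER BY w.ID;
```

With the sample rows this returns (1, a, b), (2, a, b) and (3, NULL, NULL). The temporary table disappears when the connection closes.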
Unfortunately, that is not the way SQL works. I would think it would be pretty trivial for your application to determine the difference between the ids it asked for and the ids returned.
So rather than using a hack or some weird query to mock up your result, why not have your application handle it?
I still can't understand, though, what the use case might be where you would be querying rows in the database by ids that may or may not exist.
This isn't solved, but I found out why: a MySQL view containing a UNION does not optimize well. In other words, it is slow!
Original post:
I'm working with a database for a game. There are two identical tables equipment and safety_dep_box. To check if a player has a piece of equipment I'd like to check both tables.
Instead of doing two queries, I want to take advantage of the UNION functionality in MySQL. I've recently learned that I can create a VIEW. Here's my view:
CREATE VIEW vAllEquip AS SELECT * FROM equipment UNION SELECT * FROM safety_dep_box;
The view created just fine. However when I run
SELECT * FROM vAllEquip WHERE owner=<id>
The query takes forever, while independent select queries are quick. I think I know why, but I don't know how to fix it.
Thanks!
P.S. with Additional Information:
The two tables are identical in structure, but split because they are multi-100-million row tables.
The structure includes primary key on int id, multiple index on int owner.
What I don't understand is the speed difference between the following:
SELECT COUNT(*) FROM (SELECT * FROM equipment WHERE owner=1 UNION ALL SELECT * FROM safety_dep_box WHERE owner=1) AS uES;
0.42 sec
SELECT COUNT(*) FROM (SELECT * FROM equipment WHERE owner=1 UNION SELECT * FROM safety_dep_box WHERE owner=1) AS uES;
0.37 sec
SELECT COUNT(*) FROM vAllEquip WHERE owner=1;
aborted after 60 seconds
Version: 5.1.51
mysql> explain SELECT * FROM equipment UNION SELECT * FROM safety_dep_box;
+----+--------------+----------------+------+---------------+------+---------+------+---------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+----------------+------+---------------+------+---------+------+---------+-------+
| 1 | PRIMARY | equipment | ALL | NULL | NULL | NULL | NULL | 1499148 | |
| 2 | UNION | safety_dep_box | ALL | NULL | NULL | NULL | NULL | 867321 | |
| NULL | UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | |
+----+--------------+----------------+------+---------------+------+---------+------+---------+-------+
with a WHERE clause
mysql> explain SELECT * FROM equipment WHERE owner=1 UNION ALL SELECT * FROM safety_dep_box WHERE owner=1
-> ;
+----+--------------+----------------+------+-----------------------+-------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+----------------+------+-----------------------+-------+---------+-------+------+-------+
| 1 | PRIMARY | equipment | ref | owner,owner_2,owner_3 | owner | 4 | const | 1 | |
| 2 | UNION | safety_dep_box | ref | owner,owner_3 | owner | 4 | const | 1 | |
| NULL | UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | |
+----+--------------+----------------+------+-----------------------+-------+---------+-------+------+-------+
First off, you should probably be using UNION ALL instead of plain UNION. With plain UNION, the engine will try to de-duplicate your result set. That is likely the source of your problem.
Secondly, you'll need indexes on owner in both tables, not just one. And, ideally, they'll be integer columns.
Thirdly, Randolph is right that you should not be using "*" in your SELECT statement. List out all the columns you want included. That is especially important in a UNION because the columns must match up exactly and, if there's a disagreement in the column order in your two tables you may be forcing some type conversion to go on that is costing you some time.
Finally, the phrase "There are two identical tables" is almost always a tip-off that your database is not optimally designed. These should probably be a single table. To indicate ownership of an item, your safety_dep_box table should contain only the ownerID and itemID of the item (to relate equipment and players), and possibly an additional autonumbered integer key column.
First off, don't use SELECT * in views ever. It's lazy code. Secondly, without knowing what the base tables look like, we're even less likely to be able to help you.
The reason it takes forever is because it has to build the full result and then filter it. You'll want indexes on your owner fields, whatever they may be.
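To illustrate the difference, here is a sketch that pushes the owner filter into each branch of a UNION ALL and lists the columns explicitly. The column item_name and the index name idx_owner are invented for the example; the real tables will have a different column list.

```sql
-- Hypothetical column list (id, owner, item_name); the real tables will differ.
CREATE TABLE equipment (
  id        INT PRIMARY KEY,
  owner     INT NOT NULL,
  item_name VARCHAR(50),
  INDEX idx_owner (owner)
);
CREATE TABLE safety_dep_box (
  id        INT PRIMARY KEY,
  owner     INT NOT NULL,
  item_name VARCHAR(50),
  INDEX idx_owner (owner)
);

INSERT INTO equipment (id, owner, item_name) VALUES (1, 1, 'sword'), (2, 2, 'shield');
INSERT INTO safety_dep_box (id, owner, item_name) VALUES (1, 1, 'gem');

-- The owner filter sits inside each branch, so each table can use its own
-- index on owner before the rows are combined.
SELECT id, owner, item_name, 'equipment' AS source
FROM equipment
WHERE owner = 1
UNION ALL
SELECT id, owner, item_name, 'safety_dep_box' AS source
FROM safety_dep_box
WHERE owner = 1;
```

For owner 1 this returns two rows, one from each table. The extra source column is optional; it just records which table a row came from.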