mysql - select distinct mutually exclusive (based on another column's value) rows

First off, if anyone has a suggestion for a more informative title after reading the question, please tell me, as I think mine is somewhat lacking. Now, on to business...
Given this table structure:
+---------+-------------------------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+-------------------------------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| account | varchar(20) | YES | UNI | NULL | |
| domain | varchar(100) | YES | | NULL | |
| status | enum('FAILED','PENDING','COMPLETE') | YES | | NULL | |
+---------+-------------------------------------+------+-----+---------+----------------+
And this data:
+----+---------+------------------+----------+
| id | account | domain | status |
+----+---------+------------------+----------+
| 1 | jim | somedomain.com | COMPLETE |
| 2 | bob | somedomain.com | COMPLETE |
| 3 | joe | somedomain.com | COMPLETE |
| 4 | frank | otherdomain.com | COMPLETE |
| 5 | betty | otherdomain.com | PENDING |
| 6 | shirley | otherdomain.com | FAILED |
| 7 | tom | thirddomain.com | FAILED |
| 8 | lou | fourthdomain.com | COMPLETE |
+----+---------+------------------+----------+
I would like to select all domains which have a 'COMPLETE' status for all accounts (rows).
Any domain which has a row containing any value other than 'COMPLETE' for the status must not be returned.
So in the above example, my expected result would be:
+------------------+
| domain |
+------------------+
| somedomain.com |
| fourthdomain.com |
+------------------+
Obviously, I can achieve this by using a sub-query such as:
mysql> select distinct domain from test_table where status = 'complete' and domain not in (select distinct domain from test_table where status != 'complete');
+------------------+
| domain |
+------------------+
| somedomain.com |
| fourthdomain.com |
+------------------+
2 rows in set (0.00 sec)
This works fine on our little mock-up test table, but in the real situation the tables in question will have tens (or even hundreds) of thousands of rows. I'm curious whether there is a more efficient way to do this, as the sub-query is slow and intensive.

How about this:
select domain
from test_table
group by domain
having sum(case when status = 'COMPLETE'
then 0 else 1 end) = 0
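To see the HAVING approach run end to end, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database (an assumption made only so the example is self-contained; the question itself is about MySQL) and runs the same query. The ORDER BY is added here just to make the output order deterministic.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL table from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_table ("
    " id INTEGER PRIMARY KEY, account TEXT UNIQUE, domain TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?, ?, ?)",
    [
        (1, "jim", "somedomain.com", "COMPLETE"),
        (2, "bob", "somedomain.com", "COMPLETE"),
        (3, "joe", "somedomain.com", "COMPLETE"),
        (4, "frank", "otherdomain.com", "COMPLETE"),
        (5, "betty", "otherdomain.com", "PENDING"),
        (6, "shirley", "otherdomain.com", "FAILED"),
        (7, "tom", "thirddomain.com", "FAILED"),
        (8, "lou", "fourthdomain.com", "COMPLETE"),
    ],
)

# Each non-COMPLETE row contributes 1 to the sum, so a domain passes only
# when it has no such rows. A NULL status also lands in the ELSE branch.
query = """
    SELECT domain
    FROM test_table
    GROUP BY domain
    HAVING SUM(CASE WHEN status = 'COMPLETE' THEN 0 ELSE 1 END) = 0
    ORDER BY domain
"""
all_complete = [row[0] for row in conn.execute(query)]
print(all_complete)  # ['fourthdomain.com', 'somedomain.com']
```

The table is read once and grouped, with no second pass for a NOT IN list.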

I think this will work. It effectively joins two basic queries together (all rows per domain, and the 'complete' rows per domain), then compares their counts.
select
main.domain
from
your_table main
inner join
(
select
domain, count(id) as cnt
from
your_table
where
status = 'complete'
group by
domain
) complete
on complete.domain = main.domain
group by
main.domain
having
count(main.id) = complete.cnt
You should also ensure you have an index on domain as this relies on a join on that column.
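As a rough check of the join-and-compare-counts idea, the sketch below runs it against the question's sample rows in an in-memory SQLite database. SQLite is an assumption used only to keep the example runnable; the index name idx_your_table_domain is made up, the aliases are shortened to m and c, the status literal is upper-cased because SQLite string comparison is case-sensitive, and c.cnt is added to GROUP BY so the query does not rely on lenient GROUP BY handling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table ("
    " id INTEGER PRIMARY KEY, account TEXT, domain TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [
        (1, "jim", "somedomain.com", "COMPLETE"),
        (2, "bob", "somedomain.com", "COMPLETE"),
        (3, "joe", "somedomain.com", "COMPLETE"),
        (4, "frank", "otherdomain.com", "COMPLETE"),
        (5, "betty", "otherdomain.com", "PENDING"),
        (6, "shirley", "otherdomain.com", "FAILED"),
        (7, "tom", "thirddomain.com", "FAILED"),
        (8, "lou", "fourthdomain.com", "COMPLETE"),
    ],
)
# Index on the join column, as suggested above (hypothetical index name).
conn.execute("CREATE INDEX idx_your_table_domain ON your_table (domain)")

# c holds the COMPLETE count per domain; a domain is kept only when that
# count equals its total row count.
query = """
    SELECT m.domain
    FROM your_table m
    INNER JOIN (
        SELECT domain, COUNT(id) AS cnt
        FROM your_table
        WHERE status = 'COMPLETE'
        GROUP BY domain
    ) c ON c.domain = m.domain
    GROUP BY m.domain, c.cnt
    HAVING COUNT(m.id) = c.cnt
    ORDER BY m.domain
"""
joined_result = [row[0] for row in conn.execute(query)]
print(joined_result)  # ['fourthdomain.com', 'somedomain.com']
```

Domains with no COMPLETE rows at all (thirddomain.com here) never appear in the derived table, so the inner join drops them before the counts are compared.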

Related

How to select query for certain value in MySQL query?

I have a problem with my SQL query. I want to display the rows where file = 'no' and status is not equal to 'pending-update'.
This is my current table
| name | file | status |
----------------------------------
| willy | no | pending |
| ash | no | |
| wiki | no | pending |
| Windy | no | pending-update|
| wilma | no | |
-----------------------------
I would like to create a query that will display only this output
| name | file | status |
-----------------------------
| willy | no | pending |
| ash | no | |
| wiki | no | pending |
| wilma | no | |
-----------------------------
For ash and wilma, the status column is NULL (blank). I want those rows included in the output, but I have a problem fetching the NULL values: when I run my query, the rows with a NULL status are not displayed.
This is what I have tried
SELECT name,file, status FROM tbl_geq where file = 'no' AND (status NOT LIKE 'pending-update');
When I run this query I got this output
| name | file | status |
-----------------------------
| willy | no | pending |
| wiki | no | pending |
-----------------------------
How can I fix my query and achieve this output?
| name | file | status |
-----------------------------
| willy | no | pending |
| ash | no | |
| wiki | no | pending |
| wilma | no | |
-----------------------------

You must handle the NULL values explicitly:
AND (status IS NULL OR status <> 'pending-update')
An alternate (but less readable imo) is:
AND NOT (status <=> 'pending-update')
Keep in mind that SQL uses three-valued logic: a condition can be true, false or unknown. Ordinary comparisons involving NULL (=, <>, LIKE and so on) result in "unknown", which is not the same as false, and WHERE keeps only the rows where the condition is true.
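The effect of three-valued logic is easy to reproduce. The sketch below uses an in-memory SQLite database as a stand-in for MySQL (an assumption for the sake of a runnable example); SQLite has no <=> operator, so its IS NOT operator plays the role of NOT (a <=> b) here. The helper names() is made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_geq (name TEXT, file TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO tbl_geq VALUES (?, ?, ?)",
    [
        ("willy", "no", "pending"),
        ("ash", "no", None),  # None is stored as NULL
        ("wiki", "no", "pending"),
        ("Windy", "no", "pending-update"),
        ("wilma", "no", None),
    ],
)


def names(status_condition):
    """Return the names matching file = 'no' plus the given status condition."""
    sql = (
        "SELECT name FROM tbl_geq "
        f"WHERE file = 'no' AND {status_condition} ORDER BY rowid"
    )
    return [row[0] for row in conn.execute(sql)]


# A comparison with NULL yields NULL (unknown), not true or false.
null_comparison = conn.execute("SELECT NULL <> 'pending-update'").fetchone()[0]

naive = names("status NOT LIKE 'pending-update'")
explicit = names("(status IS NULL OR status <> 'pending-update')")
null_safe = names("status IS NOT 'pending-update'")  # SQLite's null-safe form

print(null_comparison)  # None
print(naive)            # ['willy', 'wiki']
print(explicit)         # ['willy', 'ash', 'wiki', 'wilma']
print(null_safe)        # ['willy', 'ash', 'wiki', 'wilma']
```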

Sql query performance is varying though they are the same

There are 2 tables and their structure as below:
mysql> desc product;
+-------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id | int(11) | NO | PRI | NULL | |
| brand | varchar(20) | YES | | NULL | |
+-------+-------------+------+-----+---------+-------+
2 rows in set (0.02 sec)
mysql> desc sales;
+-------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-------------+------+-----+---------+-------+
| id | int(11) | YES | | NULL | |
| yearofsales | varchar(10) | YES | | NULL | |
| price | int(11) | YES | | NULL | |
+-------------+-------------+------+-----+---------+-------+
3 rows in set (0.01 sec)
Here sales.id is the foreign key referencing product.id.
The queries are as follows:
1.
mysql> select brand,sum(price),yearofsales
from product p, sales s
where p.id=s.id
group by s.id,yearofsales;
+-------+------------+-------------+
| brand | sum(price) | yearofsales |
+-------+------------+-------------+
| Nike | 917504000 | 2012 |
| FF | 328990720 | 2010 |
| FF | 328990720 | 2011 |
| FF | 723517440 | 2012 |
+-------+------------+-------------+
4 rows in set (1.91 sec)
2.
mysql> select brand,tmp.yearofsales,tmp.sum
from product p
join (
select id,yearofsales,sum(price) as sum
from sales
group by yearofsales,id
) tmp on p.id=tmp.id ;
+-------+-------------+-----------+
| brand | yearofsales | sum |
+-------+-------------+-----------+
| Nike | 2012 | 917504000 |
| FF | 2011 | 328990720 |
| FF | 2012 | 723517440 |
| FF | 2010 | 328990720 |
+-------+-------------+-----------+
4 rows in set (1.59 sec)
The question is: why does the second query take less time than the first one? I have executed them multiple times, in different orders as well.

You can check the execution plans of the two queries (EXPLAIN) and the indexes on the two tables to see why one query takes longer than the other. Also, you cannot run one simple test and trust the result: many factors can affect query execution, such as the server being busy with something else while one of the queries runs. You'll have to run both queries a large number of times and then compare the averages.
However, it is highly recommended to use explicit joins instead of implicit joins:
SELECT brand, SUM(price), yearofsales
FROM product p
INNER JOIN sales s ON p.id = s.id
GROUP BY s.id, yearofsales;
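A repeated-measurement comparison could be sketched roughly as follows. Everything here is an assumption made for illustration: an in-memory SQLite database stands in for MySQL, the sales rows are invented, mean_seconds() is a made-up helper, and EXPLAIN QUERY PLAN is SQLite's counterpart of MySQL's EXPLAIN. The timings will differ on a real server; the point is the method, not the numbers.

```python
import sqlite3
import statistics
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, brand TEXT)")
conn.execute("CREATE TABLE sales (id INTEGER, yearofsales TEXT, price INTEGER)")
conn.executemany("INSERT INTO product VALUES (?, ?)", [(1, "Nike"), (2, "FF")])
# Invented sales rows, repeated to give the queries some work to do.
sample = [(1, "2012", 100), (1, "2012", 50), (2, "2010", 20),
          (2, "2011", 30), (2, "2012", 40)]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", sample * 200)

JOIN_THEN_GROUP = """
    SELECT brand, SUM(price), yearofsales
    FROM product p
    INNER JOIN sales s ON p.id = s.id
    GROUP BY s.id, yearofsales
"""
GROUP_THEN_JOIN = """
    SELECT brand, tmp.total, tmp.yearofsales
    FROM product p
    JOIN (
        SELECT id, yearofsales, SUM(price) AS total
        FROM sales
        GROUP BY yearofsales, id
    ) tmp ON p.id = tmp.id
"""


def mean_seconds(sql, repeats=50):
    """Run the query `repeats` times and return the mean wall-clock time."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        conn.execute(sql).fetchall()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


# First confirm that both formulations return the same rows.
result_a = sorted(conn.execute(JOIN_THEN_GROUP).fetchall())
result_b = sorted(conn.execute(GROUP_THEN_JOIN).fetchall())
print("same rows:", result_a == result_b)

for label, sql in (("join then group", JOIN_THEN_GROUP),
                   ("group then join", GROUP_THEN_JOIN)):
    plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]
    print(label, "mean seconds:", mean_seconds(sql))
    print(label, "plan:", plan)
```

Comparing means over many runs smooths out one-off effects such as caching or a briefly busy server, and the plan output shows where the two queries actually differ.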

Select a column from table based on other column values

I have a table in MySQL named logs:
+---------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+---------------+------+-----+---------+-------+
| domain | varchar(50) | YES | MUL | NULL | |
| sid | varchar(100) | YES | MUL | NULL | |
| email | varchar(100) | YES | MUL | NULL | |
+---------------+---------------+------+-----+---------+-------+
The following are sample rows from the table
+-----+------------------+--------+
| sid | email            | domain |
+-----+------------------+--------+
| 1   | xxx123#yahoo.com | xxx    |
| 2   | xxx123#yahoo.com | xxx    |
| 2   | yyy123#yahoo.com | yyy    |
| 2   | yyy123#yahoo.com | yyy    |
| 3   | zzz123#yahoo.com | zzz    |
| 4   | qqq123#yahoo.com | qqq    |
| 2   | ppp123#yahoo.com | ppp    |
+-----+------------------+--------+
I want a query something like
select * from logs
where sid IN (select sid from logs
where domain="xxx" AND email="xxx123#yahoo.com")
Desired output
+-----+------------------+--------+
| sid | email            | domain |
+-----+------------------+--------+
| 1   | xxx123#yahoo.com | xxx    |
| 2   | xxx123#yahoo.com | xxx    |
| 2   | yyy123#yahoo.com | yyy    |
| 2   | yyy123#yahoo.com | yyy    |
| 2   | ppp123#yahoo.com | ppp    |
+-----+------------------+--------+
I can do it using joins, but is there any way to get the results without joins, or a more optimized version of this query?

You can use WHERE EXISTS:
select l1.* from logs l1
where exists(
select 1 from logs l2
where l1.sid = l2.sid
and l2.domain = 'xxx'
and l2.email = 'xxx123#yahoo.com'
);
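To check that the EXISTS form returns the same rows as the IN form from the question, here is a small sketch using the sample rows in an in-memory SQLite database (an assumption so the example runs anywhere; the question uses MySQL). String literals use single quotes, and ORDER BY rowid is added only to keep the row order stable for the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (domain TEXT, sid TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO logs (sid, email, domain) VALUES (?, ?, ?)",
    [
        ("1", "xxx123#yahoo.com", "xxx"),
        ("2", "xxx123#yahoo.com", "xxx"),
        ("2", "yyy123#yahoo.com", "yyy"),
        ("2", "yyy123#yahoo.com", "yyy"),
        ("3", "zzz123#yahoo.com", "zzz"),
        ("4", "qqq123#yahoo.com", "qqq"),
        ("2", "ppp123#yahoo.com", "ppp"),
    ],
)

in_query = """
    SELECT sid, email, domain
    FROM logs
    WHERE sid IN (SELECT sid FROM logs
                  WHERE domain = 'xxx' AND email = 'xxx123#yahoo.com')
    ORDER BY rowid
"""
exists_query = """
    SELECT l1.sid, l1.email, l1.domain
    FROM logs l1
    WHERE EXISTS (SELECT 1 FROM logs l2
                  WHERE l1.sid = l2.sid
                    AND l2.domain = 'xxx'
                    AND l2.email = 'xxx123#yahoo.com')
    ORDER BY l1.rowid
"""
in_rows = conn.execute(in_query).fetchall()
exists_rows = conn.execute(exists_query).fetchall()

print(in_rows == exists_rows)  # True
print(len(exists_rows))        # 5
```

Both forms keep every row whose sid also appears on a matching xxx row, which is why the sid 2 rows for yyy and ppp are included.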

First, get a proper id on those rows. Second, have you tried it? It looks like it should work, though I have no idea why you want that.
If it actually doesn't work, try this structure, which could be faster:
SELECT *
FROM some_table
WHERE relevant_field IN
(
SELECT * FROM
(
SELECT relevant_field
FROM some_table
WHERE conditions
) AS subquery
)

Do you want the whole table as the result, or just one column?
If I understand your question correctly, I would simply use:
SELECT * FROM logs WHERE domain="xxx" AND email="xxx123#yahoo.com"
Or, if you want only the sid, just replace the * with sid.
And if all the sids are numbers, why don't you use int or something similar as the column type?

It seems like you are doing something redundant. Judging by your request, you seem to be looking for
select * from logs where domain="xxx" AND email="xxx123#yahoo.com"
I don't know why you are using the first part of the SQL statement, since this is not a join with other tables.
Or am I missing something?

Querying a database of statistics to get counts of different events

I'm making a database for a soccer league that has these tables:
+---------------------+
| Tables_in_league484 |
+---------------------+
| player |
| statevent |
+---------------------+
18 rows in set (0.09 sec)
and the player table in question looks like this:
mysql> desc player;
+-----------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+----------------+
| pid | int(11) | NO | PRI | NULL | auto_increment |
| lastname | varchar(55) | YES | | NULL | |
| firstname | varchar(85) | YES | | NULL | |
| dob | date | YES | | NULL | |
| posid | int(11) | YES | MUL | NULL | |
| tid | int(11) | YES | MUL | NULL | |
| shirtnum | int(11) | YES | | NULL | |
| email | varchar(85) | YES | | NULL | |
+-----------+-------------+------+-----+---------+----------------+
8 rows in set (0.09 sec)
posid is fk for position table;
tid is fk for team table;
mysql> desc statevent;
+--------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------+-------------+------+-----+---------+----------------+
| eid | int(11) | NO | PRI | NULL | auto_increment |
| gid | int(11) | YES | MUL | NULL | |
| pid | int(11) | YES | MUL | NULL | |
| minute | int(11) | YES | | NULL | |
| typeid | int(11) | YES | | NULL | |
+--------+-------------+------+-----+---------+----------------+
5 rows in set (0.09 sec)
where the typeids are:
1 for shot
2 for save
3 for goal
4 for assist
How can I structure a MySQL query that gives me a result like this,
+------+------+-------+-------+-------+---------+
| Name | Team | Shots | Saves | Goals | Assists |
+------+------+-------+-------+-------+---------+
| Nick |    1 |     8 |     0 |     4 |       1 |
| Jeff |    4 |     5 |     0 |     5 |       6 |
| Jim  |    7 |     7 |     0 |     6 |       3 |
+------+------+-------+-------+-------+---------+
and that stops after the 10th row (LIMIT 10)?
I've been trying for hours and I'm knackered thinking about it. What do I count? What do I group by? Can I order by aliases?
EDIT
I failed to mention in my first edit that, while there are 18 helpful tables in this database, they are all empty (thus entirely useless) as they relate to the stat events.
They would have been wonderfully helpful.
However, I have to structure my query on this one table of statevents using only typeid. Is this possible?

Essentially, you're just trying to construct a simple PIVOT TABLE query. Personally I'd advocate just returning a GROUPed result set and handle the data display at the application level, but if you must do the pivoting in MySQL then it might look something like this - I've changed some column/table names to get you thinking a bit...
SELECT p.firstname
, p.team_id
, COUNT(CASE WHEN event_type_id = 1 THEN 'foo' END) Shots
, COUNT(CASE WHEN event_type_id = 2 THEN 'foo' END) Saves
, COUNT(CASE WHEN event_type_id = 3 THEN 'foo' END) Goals
, COUNT(CASE WHEN event_type_id = 4 THEN 'foo' END) Assists
FROM player p
JOIN stat_event e
ON e.player_id = p.player_id
GROUP
BY p.player_id;
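For reference, here is a sketch of the same conditional-aggregation pivot written against the column names from the question (pid, tid, typeid) rather than the renamed ones above. It runs on an in-memory SQLite database as a stand-in for MySQL, and the players and events are invented sample data. It also sorts by an alias and applies LIMIT 10, which covers the remaining parts of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE player (pid INTEGER PRIMARY KEY, firstname TEXT, tid INTEGER)"
)
conn.execute(
    "CREATE TABLE statevent ("
    " eid INTEGER PRIMARY KEY, gid INTEGER, pid INTEGER,"
    " minute INTEGER, typeid INTEGER)"
)
conn.executemany("INSERT INTO player VALUES (?, ?, ?)",
                 [(1, "Nick", 1), (2, "Jeff", 4)])
# Invented events; typeid: 1 = shot, 2 = save, 3 = goal, 4 = assist.
conn.executemany(
    "INSERT INTO statevent (gid, pid, minute, typeid) VALUES (?, ?, ?, ?)",
    [
        (1, 1, 5, 1), (1, 1, 17, 1), (1, 1, 17, 3), (1, 1, 60, 4),  # Nick
        (1, 2, 9, 1), (1, 2, 9, 3), (1, 2, 75, 3),                  # Jeff
    ],
)

# CASE without ELSE yields NULL, and COUNT ignores NULLs, so each COUNT
# only counts the events of one type.
query = """
    SELECT p.firstname AS Name,
           p.tid       AS Team,
           COUNT(CASE WHEN e.typeid = 1 THEN 1 END) AS Shots,
           COUNT(CASE WHEN e.typeid = 2 THEN 1 END) AS Saves,
           COUNT(CASE WHEN e.typeid = 3 THEN 1 END) AS Goals,
           COUNT(CASE WHEN e.typeid = 4 THEN 1 END) AS Assists
    FROM player p
    JOIN statevent e ON e.pid = p.pid
    GROUP BY p.pid, p.firstname, p.tid
    ORDER BY Goals DESC, Name
    LIMIT 10
"""
stats = conn.execute(query).fetchall()
for row in stats:
    print(row)
# ('Jeff', 4, 1, 0, 2, 0)
# ('Nick', 1, 2, 0, 1, 1)
```

Because of the inner join, players without any events are left out; a LEFT JOIN would keep them with zero counts.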

You would have to join the player table with the other tables you need counts from (shots, saves, goals, etc.).
Once you have the joins in place, you need to aggregate on player id, player name and team with a GROUP BY clause.
Your final query will look something like this:
SELECT p.firstname, t.team, COUNT(sh.shots), COUNT(sa.saves), COUNT(g.goals),COUNT(a.assists)
FROM player p
INNER JOIN team t
ON p.tid = t.tid
....
GROUP BY p.pid, p.firstname, t.team
LIMIT 10
EDIT:
I am not a DB expert. I have one SUBOPTIMAL way of achieving this.
I would create a temporary table containing information of the form (it would have to contain pid and tid information too):
...
Nick Goals 13
Matt Saves 4
Nick Saves 11
...
This should be simple to achieve.
I would then use a SQL cursor to iterate over all distinct player ids and recover statistics from the temporary table we constructed above.

Temporary assigning NULL to a value for the purpose of sorting values in mysql

I am trying to get the correct formatting (ordering) of results back from a MySQL query. When I exclude the NULL values, the ordering is correct, but when I allow NULL values to be included, my results are messed up.
I have the following query I am using:
select name,suite,webpagetest.id,MIN(priority) AS min_pri
FROM webpagetest,comparefileerrors
WHERE vco="aof" AND user="1" AND calibreversion="9"
AND webpagetest.id=comparefileerrors.id
AND comparefileerrors.priority IS NOT NULL
GROUP BY id
ORDER BY coalesce(priority,suite,name) ASC;
This returns the expected output:
+-----------------------------+-----------------------------+-------+---------+
| name | suite | id | min_pri |
+-----------------------------+-----------------------------+-------+---------+
| set_get_status | shortsRepairDB_2009.1_suite | 6193 | 0 |
| u2uDemo | shortsRepairDB_2009.1_suite | 6195 | 0 |
| change_sets | shortsRepairDB_2009.1_suite | 6194 | 0 |
| bz1508_SEGV_password | NULL | 6185 | 1 |
| assign_short_AND_user_info | shortsRepairDB_2009.1_suite | 6198 | 2 |
| bz1273_cmdline_execplussvdb | NULL | 6203 | 2 |
| bz1747_bad_lvsf | NULL | 36683 | 3 |
+-----------------------------+-----------------------------+-------+---------+
However, sometimes the priority values will not be set. If this is the case, I want the database to treat the priority as if it had an extremely high priority, so that the values with a null-priority are at the very bottom. I can not set the priority ahead of time (using a default value), but for the purposes of the sort, is it possible to do this?
Currently, if I issue the following command,
select name,suite,webpagetest.id,MIN(priority) AS min_pri
FROM webpagetest,comparefileerrors
WHERE vco="aof" AND user="1" AND calibreversion="9"
AND webpagetest.id=comparefileerrors.id
GROUP BY id
ORDER BY coalesce(priority,suite,name) ASC;
I get output like the following:
| name | suite | id | min_pri |
+-----------------------------+-------+-------+---------+
| bz1747_bad_lvsf | NULL | 36683 | 1 |
| NEC_Dragon.query | NULL | 36684 | NULL |
| avago_hwk_elam0_asic | NULL | 6204 | NULL |
| bz1273_cmdline_execplussvdb | NULL | 6203 | 2 |
| bz1491_query_server_crash | NULL | 6188 | NULL |
| bz1493_export_built_in_prop | NULL | 6186 | NULL |
+-----------------------------+-------+-------+---------+
6 rows in set (0.68 sec)
Here I have lost the formatting I had before. I would like the formatting to be as follows:
| name | suite | id | min_pri |
+-----------------------------+-------+-------+---------+
| bz1747_bad_lvsf | NULL | 36683 | 0 |
| NEC_Dragon.query | NULL | 36684 | 0 |
| avago_hwk_elam0_asic | NULL | 6204 | 1 |
| bz1273_cmdline_execplussvdb | NULL | 6203 | 2 |
| bz1491_query_server_crash | NULL | 6188 | NULL |
| bz1493_export_built_in_prop | NULL | 6186 | NULL |
+-----------------------------+-------+-------+---------+
6 rows in set (0.68 sec)
Hopefully I've explained this well enough that someone can understand what I want here.
Thanks for looking!

If you don't want to use a sentinel value, i.e. ORDER BY COALESCE(priority, 99999), use:
select * from x
order by
case
when priority is not null then 1 /* non-nulls, first */
else 2 /* nulls last */
end,
priority
Or you can take advantage of the fact that a MySQL boolean expression evaluates to either 1 or 0:
select * from x
order by priority is null, priority
Or, if you're using PostgreSQL, to put the NULLs first:
select * from x order by priority nulls first
Or, to put them last (which is what the question asks for):
select * from x order by priority nulls last
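The boolean-expression trick and the sentinel variant can be compared side by side. This sketch uses an in-memory SQLite database, where priority IS NULL also evaluates to 0 or 1 (an assumption standing in for MySQL); the table contents are invented, and name is added as a tie-breaker so the order is fully determined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (name TEXT, priority INTEGER)")
conn.executemany(
    "INSERT INTO x VALUES (?, ?)",
    [("a", 2), ("b", None), ("c", 0), ("d", None), ("e", 1)],
)

# priority IS NULL is 0 for real priorities and 1 for NULLs, so the
# non-NULL rows sort first, then by priority within that group.
nulls_last = conn.execute(
    "SELECT name, priority FROM x ORDER BY priority IS NULL, priority, name"
).fetchall()

# Sentinel variant: NULL is treated as a very large priority for sorting only.
sentinel = conn.execute(
    "SELECT name, priority FROM x ORDER BY COALESCE(priority, 99999), name"
).fetchall()

print(nulls_last)
# [('c', 0), ('e', 1), ('a', 2), ('b', None), ('d', None)]
print(nulls_last == sentinel)  # True
```

The two orderings agree as long as no real priority reaches the sentinel value, which is the main reason to prefer the IS NULL form.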

Sounds like you want MIN(IFNULL(priority, 99999)). See the documentation for the IFNULL() function.
