Comparing two SQL queries - mysql

I've got a MySQL database of all NCAA basketball tournament results. I'm looking at the "haves" and "have nots" of college hoops, and looking for who drops in and out of the "haves" list by examining NCAA tournament bids over time.
I've got a query that counts the number of NCAA appearances by each team for two sets of years. I want to compare the results for the two sets - seeing who dropped out and who dropped in from one year to the next.
For example, which teams made 6 of 10 NCAA tournaments between 1985-94, which made 6 between 1986-95, and what are the differences in the two lists. Here's what I have:
Select t1.Team AS "1994 Teams",t2.Team AS "1995 Teams"
FROM
(SELECT Count(DISTINCT TABLE_NAME.`Year`) AS 'Totals', TABLE_NAME.Team, TABLE_NAME.Current_Conference
FROM TABLE_NAME
WHERE TABLE_NAME.`Year` BETWEEN 1985 AND 1994
GROUP BY TABLE_NAME.Team HAVING Totals >= 6
ORDER BY TABLE_NAME.Team) AS t1,
(SELECT Count(DISTINCT TABLE_NAME.`Year`) AS 'Totals', TABLE_NAME.Team, TABLE_NAME.Current_Conference
FROM TABLE_NAME
WHERE TABLE_NAME.`Year` BETWEEN 1986 AND 1995
GROUP BY TABLE_NAME.Team HAVING Totals >= 6
ORDER BY TABLE_NAME.Team) AS t2
WHERE t1.Team = t2.Team
This returns (in this case) 32 records - all the teams that were in 6 of 10 NCAA tournaments in both 1985-94 and 1986-95. I'm trying to find the teams that are in one set and not the other.

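One way to do this is with a NOT IN subquery in the WHERE clause. The query below returns the teams that qualified in 1985-94 but not in 1986-95; swap the two year ranges to get the teams that dropped in: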
SELECT t1.Team
FROM (
SELECT COUNT(DISTINCT TABLE_NAME.`Year`) AS 'Totals',
TABLE_NAME.Team,
TABLE_NAME.Current_Conference
FROM TABLE_NAME
WHERE TABLE_NAME.`Year` BETWEEN 1985 AND 1994
GROUP BY TABLE_NAME.Team
HAVING Totals >= 6
ORDER BY TABLE_NAME.Team
) AS t1
WHERE t1.team NOT IN (
SELECT TABLE_NAME.Team
FROM TABLE_NAME
WHERE TABLE_NAME.`Year` BETWEEN 1986 AND 1995
GROUP BY TABLE_NAME.Team
HAVING COUNT(DISTINCT TABLE_NAME.`Year`) >= 6 )
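To see both directions (who dropped out and who dropped in), the same NOT IN filter can be run twice with the windows swapped. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, a hypothetical table called results, made-up team names, and a threshold of 2 appearances in a 3-year window instead of 6 in 10.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (Team TEXT, Year INTEGER)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?)",
    [
        ("Alpha", 1985), ("Alpha", 1986), ("Alpha", 1987), ("Alpha", 1988),
        ("Beta", 1985), ("Beta", 1986),
        ("Gamma", 1987), ("Gamma", 1988),
    ],
)

# Teams meeting the threshold in the first window but not in the second.
ONLY_IN_FIRST = """
SELECT t1.Team
FROM (SELECT Team FROM results
      WHERE Year BETWEEN ? AND ?
      GROUP BY Team
      HAVING COUNT(DISTINCT Year) >= ?) AS t1
WHERE t1.Team NOT IN (
      SELECT Team FROM results
      WHERE Year BETWEEN ? AND ?
      GROUP BY Team
      HAVING COUNT(DISTINCT Year) >= ?)
ORDER BY t1.Team
"""

def only_in_first(first, second, minimum):
    """Return teams that qualify in window `first` but not in window `second`."""
    params = (*first, minimum, *second, minimum)
    return [row[0] for row in conn.execute(ONLY_IN_FIRST, params)]

dropped_out = only_in_first((1985, 1987), (1986, 1988), 2)
dropped_in = only_in_first((1986, 1988), (1985, 1987), 2)
print(dropped_out, dropped_in)
```

With this sample data, Beta qualifies only in the earlier window and Gamma only in the later one, while Alpha qualifies in both and appears in neither list.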

Related

MySQL query to count zero value using group by in the same table

Here's my "customers" table:
To get the number of enquiries for a particular month and year, I'm using the following query:
SELECT YEAR(customer_date) AS Year, MONTH(customer_date) AS Month, COUNT(customer_id) AS Count FROM customers WHERE customer_product = 6 GROUP BY YEAR(customer_date), MONTH(customer_date)
I get following result:
As you can see, there is no enquiry in April, so no row is returned for month 4. But I want a 0 in the Count column when no record is found for a particular month and year.
This is what I want:
One option uses a calendar table to represent all months and years, even those which do not appear in your data set:
SELECT
t1.year,
t2.month,
COUNT(c.customer_id) AS Count
FROM
(
SELECT 2017 AS year UNION ALL
SELECT 2018
) t1
CROSS JOIN
(
SELECT 1 AS month UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5 UNION ALL
SELECT 6 UNION ALL
SELECT 7 UNION ALL
SELECT 8 UNION ALL
SELECT 9 UNION ALL
SELECT 10 UNION ALL
SELECT 11 UNION ALL
SELECT 12
) t2
LEFT JOIN customers c
ON t1.year = YEAR(c.customer_date) AND
t2.month = MONTH(c.customer_date) AND
c.customer_product = 6
GROUP BY
t1.year,
t2.month
ORDER BY
t1.year,
t2.month;
Note: The above query can probably be made faster by actually creating dedicated calendar tables in your MySQL schema.
The following index on the customers table might help:
CREATE INDEX idx ON customers(customer_product, customer_id);
This might make the join between the calendar tables and customers faster, assuming that the customer_product = 6 condition is restrictive.
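Where the product filter sits matters with a LEFT JOIN: a condition on the customers table placed in WHERE discards the NULL-extended rows for empty months, while the same condition placed in ON keeps them. The sketch below illustrates the difference under assumptions: Python's built-in sqlite3 stands in for MySQL (so strftime replaces MONTH()), the months table is a hypothetical three-row calendar, and the customer rows are made up.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER, customer_product INTEGER, customer_date TEXT);
INSERT INTO customers VALUES
  (1, 6, '2017-03-10'),
  (2, 6, '2017-03-22'),
  (3, 6, '2017-05-02'),
  (4, 7, '2017-04-15');
CREATE TABLE months (month INTEGER);  -- tiny hypothetical calendar table
INSERT INTO months VALUES (3), (4), (5);
""")

# Same query, with the product filter injected either into ON or into WHERE.
BASE = """
SELECT m.month, COUNT(c.customer_id)
FROM months m
LEFT JOIN customers c
  ON m.month = CAST(strftime('%m', c.customer_date) AS INTEGER)
  {on_extra}
{where}
GROUP BY m.month
ORDER BY m.month
"""

filter_in_on = conn.execute(
    BASE.format(on_extra="AND c.customer_product = 6", where="")
).fetchall()
filter_in_where = conn.execute(
    BASE.format(on_extra="", where="WHERE c.customer_product = 6")
).fetchall()

print(filter_in_on)     # April is kept with a count of 0
print(filter_in_where)  # April is missing
```

April has only a product 7 enquiry here, so the ON version reports it with a count of 0 and the WHERE version drops the row altogether.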

Join a table to itself to display average value?

I have a list of game scores for various teams and various years.
Database looks like this:
id |team|year|week|points
1 | Wildcats|2015|1|43
2 | Wildcats|2015|2|50
I want to create a display that shows each team in the database, its total points scored for the year, and the league average points for that year.
So it might look like:
Wildcats 2015 387 44.3
etc.
I was trying this, but it's not working:
SELECT g1.year, g1.team, xyz.total
FROM game g1
join (SELECT avg(points) as total, year
FROM game g2) xyz on g1.year=xyz.year
group by g1.year, g1.team
You can use a correlated subquery to get average points per year:
SELECT team, year, SUM(points) AS totalPoints,
(SELECT AVG(points)
FROM game AS g2
WHERE g2.year = g1.year) AS avgPoints
FROM game AS g1
GROUP BY team, year
or, with joining a derived table:
SELECT g1.team, g1.year, SUM(g1.points) AS totalPoints,
g2.avgPoints
FROM game AS g1
JOIN (SELECT AVG(points) AS avgPoints, year
FROM game
GROUP BY year
) AS g2 ON g1.year = g2.year
GROUP BY g1.team, g1.year, g2.avgPoints
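Maybe this query is what you are looking for. Note that it returns each team's own average for the year rather than the league average:
SELECT team, year, SUM(points) AS total,
CAST(AVG(points) AS DECIMAL(10,1)) AS average
FROM your_table_name
GROUP BY team, year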
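The difference between a league average and a per-team average is easy to check on a small data set. This is a rough sketch, assuming Python's built-in sqlite3 as a stand-in for MySQL; the Tigers rows are invented for illustration and are not from the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE game (id INTEGER, team TEXT, year INTEGER, week INTEGER, points INTEGER);
INSERT INTO game VALUES
  (1, 'Wildcats', 2015, 1, 43),
  (2, 'Wildcats', 2015, 2, 50),
  (3, 'Tigers',   2015, 1, 30),
  (4, 'Tigers',   2015, 2, 37);
""")

# Team totals next to the league-wide average for the same year.
league = conn.execute("""
SELECT g1.team, g1.year, SUM(g1.points) AS totalPoints, g2.avgPoints
FROM game AS g1
JOIN (SELECT year, AVG(points) AS avgPoints
      FROM game
      GROUP BY year) AS g2
  ON g1.year = g2.year
GROUP BY g1.team, g1.year, g2.avgPoints
ORDER BY g1.team
""").fetchall()

# Plain AVG grouped by team and year: each team's own average instead.
per_team = conn.execute("""
SELECT team, year, SUM(points), AVG(points)
FROM game
GROUP BY team, year
ORDER BY team
""").fetchall()

print(league)
print(per_team)
```

Every row of the first result carries the same average for 2015 (40.0 with this data), whereas the second result reports 33.5 for Tigers and 46.5 for Wildcats.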

Using a sum of values as a condition (SQL query)

I have a table that looks roughly like this
Year Species Count
1979 A 0
1980 A 10
1981 A 4
1982 A 3
1979 B 0
1980 B 1
1981 B 2
1982 B 3
1979 C 9
1980 C 14
1981 C 2
1982 C 1
What I want is to return all Year, Species, Count rows for those species that have a total count (as in summed over all years) of 10 or more. So for a total count of 20, I would want it to return just
1979 C 9
1980 C 14
1981 C 2
1982 C 1
I played around with HAVING but haven't really gotten anything useful (total SQL beginner).
In MySQL, you can do this using aggregation and a join (mytable stands in for your table name):
select t.*
from mytable t join
(select species, sum(`count`) as total
from mytable
group by species
) s
on t.species = s.species
where s.total >= 10;
This is the easiest. You already have the counts. Group on species and filter the table on the results of the subquery. You can get the same functionality with an EXISTS or a join also.
SELECT
[YEAR]
,SPECIES
,[COUNT]
FROM TABLE
WHERE SPECIES IN (
SELECT SPECIES
FROM TABLE
GROUP BY SPECIES
HAVING SUM([COUNT]) >= 20
)
Adding some additional explanation for BootstrapBill:
GROUP BY "makes multiple sets", one for each unique value of the GROUP BY column. That allows the aggregate function SUM() to act on only one set of the GROUP BY values at a time. HAVING is sort of like a WHERE clause for the GROUP BY statement that allows you to apply a predicate. The only fields allowed to be returned by a GROUP BY are the grouped column itself and the results of any aggregate function(s), so you need to join back to or filter the original set to get the other columns you are targeting in the query.
And I apologize, I did not see where the OP stated this was for MySQL. The core concept is the same, so I am leaving the answer. [] is MS SQL syntax for escaping the keywords COUNT and YEAR.
You'll want to use GROUP BY with the SUM() aggregate function and HAVING clause (similar to WHERE, but for groups instead of rows), combined with a self-join:
SELECT t1.`Year`, t1.`Species`, t1.`Count`
FROM mytable t1 INNER JOIN (
SELECT `Species`, SUM(`Count`)
FROM mytable
GROUP BY `Species`
HAVING SUM(`Count`) >= 20
) t2
ON t1.`Species` = t2.`Species`
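The join-back pattern can be checked against the sample rows from the question. This is a minimal sketch, assuming Python's built-in sqlite3 as a stand-in for MySQL (identifiers are double-quoted rather than backtick-quoted) and the table name mytable used above.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE mytable ("Year" INTEGER, "Species" TEXT, "Count" INTEGER)')
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1979, "A", 0), (1980, "A", 10), (1981, "A", 4), (1982, "A", 3),
        (1979, "B", 0), (1980, "B", 1), (1981, "B", 2), (1982, "B", 3),
        (1979, "C", 9), (1980, "C", 14), (1981, "C", 2), (1982, "C", 1),
    ],
)

# Keep every row of the species whose summed count reaches the threshold.
QUERY = """
SELECT t1."Year", t1."Species", t1."Count"
FROM mytable t1
INNER JOIN (
    SELECT "Species"
    FROM mytable
    GROUP BY "Species"
    HAVING SUM("Count") >= ?
) t2 ON t1."Species" = t2."Species"
ORDER BY t1."Species", t1."Year"
"""

at_least_20 = conn.execute(QUERY, (20,)).fetchall()
species_at_least_10 = sorted({row[1] for row in conn.execute(QUERY, (10,))})
print(at_least_20)
print(species_at_least_10)
```

The totals are 17 for A, 6 for B and 26 for C, so a threshold of 20 returns only the four C rows and a threshold of 10 returns the rows for A and C.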

MySQL query problems with combined SUM

I have three tables here, that I'm trying to do a tricky combined query on.
Table 1(teams) has Teams in it:
id name
------------
150 LA Lakers
151 Boston Celtics
152 NY Knicks
Table 2(scores) has scores in it:
id teamid week score
---------------------------
1 150 5 75
2 151 5 95
3 152 5 112
Table 3(tickets) has tickets in it
id teamids week
---------------------
1 150,152,154 5
2 151,154,155 5
I have two queries that I'm trying to write
Rather than trying to sum these each time I query the tickets, I've added a weekly_score field to the ticket. The idea is that any time a new score is entered for a team, I could take that team's id, get all tickets that have that team / week combo, and update them all based on the sum of their team scores.
I've tried the following to get the results i'm looking for (before I try and update them):
SELECT t.id, t.teamids, (
SELECT SUM( s1.score )
FROM scores s1
WHERE s1.teamid
IN (
t.teamids
)
AND s1.week =11
) AS score
FROM tickets t
WHERE t.week =11
AND (t.teamids LIKE "150,%" OR t.teamids LIKE "%,150")
Not only is the query slow, but it also seems to not return the sum of the scores, it just returns the first score in the list.
Any help is greatly appreciated.
If you are going to match on the list, you'll need to account for the column holding only one team id. Also, you'll need to use LIKE in your SELECT subquery.
SELECT t.id, t.teamids, (
SELECT SUM( s1.score )
FROM scores s1
WHERE
(t.teamids LIKE s1.teamid
OR t.teamids LIKE CONCAT(s1.teamid, ",%")
OR t.teamids LIKE CONCAT("%,", s1.teamid)
OR t.teamids LIKE CONCAT("%,", s1.teamid, ",%")
)
AND s1.week =11
) AS score
FROM tickets t
WHERE t.week =11
AND (t.teamids LIKE "150,%" OR t.teamids LIKE "%,150" OR t.teamids LIKE "%,150,%" OR t.teamids LIKE "150")
You don't need the SUM function here? The scores table already has the score. And by the way, avoid subqueries; try a LEFT JOIN instead (LEFT OUTER JOIN is the same thing in MySQL).
SELECT t.id, t.name, t1.score, t2.teamids
FROM teams t
LEFT JOIN scores t1 ON t.id = t1.teamid AND t1.week = 11
LEFT JOIN tickets t2 ON t2.week = 11
WHERE t2.week = 11 AND t2.teamids LIKE "%150%"
Not tested.
Well, not the most elegant query ever, but it should work:
SELECT
tickets.id,
tickets.teamids,
sum(score)
FROM
tickets left join scores
on concat(',', tickets.teamids, ',') like concat('%,', scores.teamid, ',%')
and scores.week = tickets.week
WHERE tickets.week = 11 and concat(',', tickets.teamids, ',') like '%,150,%'
GROUP BY tickets.id, tickets.teamids
or also this:
SELECT
tickets.id,
tickets.teamids,
sum(score)
FROM
tickets left join scores
on FIND_IN_SET(scores.teamid, tickets.teamids)>0
and scores.week = tickets.week
WHERE tickets.week = 11 and FIND_IN_SET('150', tickets.teamids)>0
GROUP BY tickets.id, tickets.teamids
(See this question and its answers for more information.)
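Wrapping both the list and the id in commas is what keeps 150 from matching 1500. The following sketch illustrates that under assumptions: Python's built-in sqlite3 stands in for MySQL, so || replaces CONCAT() and FIND_IN_SET is not available; ticket 3 and the week 6 score are invented rows added to show the edge cases.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scores (id INTEGER, teamid INTEGER, week INTEGER, score INTEGER);
INSERT INTO scores VALUES
  (1, 150, 5, 75),
  (2, 151, 5, 95),
  (3, 152, 5, 112),
  (4, 150, 6, 10);            -- another week, must not be summed
CREATE TABLE tickets (id INTEGER, teamids TEXT, week INTEGER);
INSERT INTO tickets VALUES
  (1, '150,152,154', 5),
  (2, '151,154,155', 5),
  (3, '1500,151', 5);         -- contains the text 150 but not team 150
""")

# Comma-wrapped matching: ',150,152,154,' LIKE '%,150,%'
summed = conn.execute("""
SELECT t.id, t.teamids, SUM(s.score)
FROM tickets t
LEFT JOIN scores s
  ON ',' || t.teamids || ',' LIKE '%,' || s.teamid || ',%'
 AND s.week = t.week
WHERE t.week = 5
  AND ',' || t.teamids || ',' LIKE '%,150,%'
GROUP BY t.id, t.teamids
""").fetchall()

# Naive substring match for comparison: also picks up ticket 3.
naive_ids = [row[0] for row in conn.execute(
    "SELECT id FROM tickets WHERE week = 5 AND teamids LIKE '%150%' ORDER BY id"
)]

print(summed)
print(naive_ids)
```

Only ticket 1 is returned, with 75 + 112 = 187 for week 5, while the naive pattern also matches ticket 3.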

Get date when two things appear at the same time (mysql query)

Is there a SQL query that can return the date when two things appear together?
For example, say I have a table containing a bus schedule, with buses A and B. Bus A operates on 22 May, 24 May, and 25 May, while bus B operates on 22 May, 24 May, and 26 May. I want to get the most recent date on which both buses appear, which is 24 May.
To see the dates that both buses share:
SELECT t.date
FROM YOUR_TABLE t
WHERE t.bus IN ('A', 'B')
GROUP BY t.date
HAVING COUNT(DISTINCT t.bus) = 2
To see the most recent date that both buses share:
SELECT t.date
FROM YOUR_TABLE t
WHERE t.bus IN ('A', 'B')
GROUP BY t.date
HAVING COUNT(DISTINCT t.bus) = 2
ORDER BY t.date DESC
LIMIT 1
Assuming you have a table named bus_schedule that contains a bus_name and bus_date field, something like this should work:
select bus_schedule_a.bus_date
from bus_schedule bus_schedule_a
inner join bus_schedule bus_schedule_b
on bus_schedule_a.bus_date = bus_schedule_b.bus_date
and bus_schedule_a.bus_name = 'A'
and bus_schedule_b.bus_name = 'B'
order by bus_schedule_a.bus_date desc
limit 1
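The GROUP BY / HAVING approach can be tried on the schedule from the question. This is a sketch under assumptions: Python's built-in sqlite3 stands in for MySQL, the table and column names (bus_schedule, bus, date) are hypothetical, the year 2010 is invented because the question gives only day and month, and bus C is an extra row added to show why the filter on A and B matters.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bus_schedule (bus TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO bus_schedule VALUES (?, ?)",
    [
        ("A", "2010-05-22"), ("A", "2010-05-24"), ("A", "2010-05-25"),
        ("B", "2010-05-22"), ("B", "2010-05-24"), ("B", "2010-05-26"),
        ("C", "2010-05-26"),  # shares a date with B only
    ],
)

# Dates on which both A and B run: two distinct buses per date after filtering.
SHARED = """
SELECT t.date
FROM bus_schedule t
WHERE t.bus IN ('A', 'B')
GROUP BY t.date
HAVING COUNT(DISTINCT t.bus) = 2
ORDER BY t.date DESC
"""

shared_dates = [row[0] for row in conn.execute(SHARED)]
most_recent = conn.execute(SHARED + " LIMIT 1").fetchone()[0]
print(shared_dates, most_recent)
```

26 May is not returned even though two buses run that day, because only one of them is A or B.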