Let's say that I have a table of race results. The table consists of seven columns: Date (MySQL DATE format, xxxx-xx-xx) and one column for the name of each of the top six finishers, named First, Second, Third, Fourth, Fifth, and Sixth. I have several sets of results and maybe 100 or so different names in the various finisher columns. I need a query that lists each person whose name has appeared in any of the finisher columns (First, Second, Third, Fourth, Fifth, Sixth) along with only the most recent date on which their name appeared. I do NOT need separate results based on finishing place, so all six finisher columns should be lumped together. Most of the names will appear on dozens of different dates, but I only need the most recent date for each name. Ideally the result would be a list of each name and their most recent finish date, sorted from least recent to most recent. I tried to create a fiddle to demonstrate this, but for whatever reason I could not get the date to work correctly in the fiddle. Anyway, even a shred of help on this would be greatly appreciated.
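-- MAX() picks the latest date per name within each column. A name that appears
-- in several columns still yields one row per column, so aggregate once more over the union if needed.
SELECT MAX(<table>.date) AS date, <table>.first AS name FROM <table> GROUP BY name
UNION DISTINCT
SELECT MAX(<table>.date) AS date, <table>.second AS name FROM <table> GROUP BY name
UNION DISTINCT
SELECT MAX(<table>.date) AS date, <table>.third AS name FROM <table> GROUP BY name
UNION DISTINCT
SELECT MAX(<table>.date) AS date, <table>.fourth AS name FROM <table> GROUP BY name
UNION DISTINCT
SELECT MAX(<table>.date) AS date, <table>.fifth AS name FROM <table> GROUP BY name
UNION DISTINCT
SELECT MAX(<table>.date) AS date, <table>.sixth AS name FROM <table> GROUP BY name
ORDER BY date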
This gave me exactly the results I needed. I basically just used "name" to catch the finishers, and "MAX(Date) as last" to pick out only the most recent showing for each "name" when I queried all six columns.
SELECT name, MAX(Date) as last
FROM (
SELECT first AS name, Date FROM Results
UNION ALL
SELECT second, Date FROM Results
UNION ALL
SELECT third, Date FROM Results
UNION ALL
SELECT fourth, Date FROM Results
UNION ALL
SELECT fifth, Date FROM Results
UNION ALL
SELECT sixth, Date FROM Results) Q
GROUP BY name
ORDER BY last ASC;
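The UNION ALL plus MAX approach above can be tried without a MySQL server. The following is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. The runner names and dates are invented sample data, SQLite stands in for MySQL, and the alias last_date replaces last only to stay clear of SQLite keywords.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the query above.
columns = ["first", "second", "third", "fourth", "fifth", "sixth"]
conn = sqlite3.connect(":memory:")

# Column names are double-quoted because FIRST is a keyword in SQLite.
column_defs = ", ".join(f'"{c}" TEXT' for c in columns)
conn.execute(f"CREATE TABLE Results (Date TEXT, {column_defs})")
conn.executemany(
    "INSERT INTO Results VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("2023-01-05", "Ann", "Bob", "Cat", "Dan", "Eve", "Fay"),
        ("2023-02-10", "Bob", "Ann", "Gus", "Cat", "Hal", "Eve"),
        ("2023-03-15", "Gus", "Hal", "Ann", "Ivy", "Bob", "Jon"),
    ],
)

# Stack the six finisher columns into one (name, Date) list, then keep
# the latest date per name.
unpivot = " UNION ALL ".join(
    f'SELECT "{c}" AS name, Date FROM Results' for c in columns
)
query = (
    f"SELECT name, MAX(Date) AS last_date FROM ({unpivot}) AS q "
    "GROUP BY name ORDER BY last_date ASC, name ASC"
)
last_seen = conn.execute(query).fetchall()

for name, last_date in last_seen:
    print(name, last_date)
```

Each name comes back exactly once, paired with its latest date, with the least recent finishers listed first.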
select t.name, max(a.date) as last
from (select first as name from race
      union select second from race
      union select third from race
      union select fourth from race
      union select fifth from race
      union select sixth from race) as t
join race a on t.name in (a.first, a.second, a.third, a.fourth, a.fifth, a.sixth)
group by t.name
order by last
Related
I want to select from two identical tables using UNION ALL and GROUP BY. However, the GROUP BY doesn't work. Here is my query:
SELECT type , COUNT(subscription.id) as number ,SUM(subscription.amount) as total
FROM subscription
WHERE DATE(subscription.timestamp) BETWEEN '2022-10-18' AND '2022-10-18'
UNION ALL
SELECT type , COUNT(archive_subscription.id) as number ,SUM(archive_subscription.amount) as total
FROM archive_subscription
WHERE DATE(archive_subscription.timestamp) BETWEEN '2022-10-18' AND '2022-10-18'
GROUP BY type
The result is like the following:
type  number  amount
1     2       180000
1     1       80000
What I want is to merge the rows from both tables using GROUP BY, but it won't work:
type  number  amount
1     3       260000
Please, any suggestions? Thanks
The first subquery in your union is missing its GROUP BY clause. Instead, try taking the union first and then aggregating:
SELECT type, COUNT(*) AS number, SUM(amount) AS total
FROM
(
SELECT type, amount
FROM subscription
WHERE DATE(timestamp) = '2022-10-18'
UNION ALL
SELECT type, amount
FROM archive_subscription
WHERE DATE(timestamp) = '2022-10-18'
) t
GROUP BY type;
Note that a BETWEEN range with identical bounds is trivial. If you really want to restrict to a single date, just use WHERE DATE(timestamp) = '2022-10-18' instead. Keep the DATE() call if the column stores a time of day; comparing the raw timestamp to '2022-10-18' would only match rows at exactly midnight.
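The union-then-aggregate idea can be checked on a small in-memory SQLite database from Python. This is only a sketch: the rows are invented to reproduce the 2 + 1 split from the question, the timestamp column is assumed to hold a date and a time, and SQLite stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows: two matching rows in the live table, one in the archive,
# plus one row on a different day that the filter must exclude.
conn.executescript("""
CREATE TABLE subscription (
    id INTEGER PRIMARY KEY, type INTEGER, amount INTEGER, timestamp TEXT);
CREATE TABLE archive_subscription (
    id INTEGER PRIMARY KEY, type INTEGER, amount INTEGER, timestamp TEXT);
INSERT INTO subscription (type, amount, timestamp) VALUES
    (1, 100000, '2022-10-18 09:15:00'),
    (1, 80000,  '2022-10-18 17:40:00'),
    (1, 50000,  '2022-10-19 08:00:00');
INSERT INTO archive_subscription (type, amount, timestamp) VALUES
    (1, 80000,  '2022-10-18 12:00:00');
""")

# Union the raw rows first, then aggregate once over the combined set.
query = """
SELECT type, COUNT(*) AS number, SUM(amount) AS total
FROM (
    SELECT type, amount FROM subscription
    WHERE DATE(timestamp) = '2022-10-18'
    UNION ALL
    SELECT type, amount FROM archive_subscription
    WHERE DATE(timestamp) = '2022-10-18'
) AS t
GROUP BY type
"""
merged = conn.execute(query).fetchall()
print(merged)  # [(1, 3, 260000)]
```

Because the GROUP BY runs once over the combined rows, both tables contribute to the same group instead of producing one result row per table.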
I have a subquery that aggregates some UNION ALL selects. Over that I prepare the SELECT to create cross-tab and limit it to let's say 20. I would like to be able to retrieve the total COUNT of sub query results before I am limiting them in main query. This is for the purpose of trying to build a pagination that receives the total number of records and then the specific page record grid.
Sample query:
SELECT
name,
sumIf(metric_value, metric_name = 'data') AS data,
sumif(....
FROM
(SELECT
name, metric_name, SUM(metric_value) as metric_value
FROM
(SELECT
name, 'data' AS metric_name, SUM(data) AS metric_value
FROM
table
WHERE
date > '2017-01-01 00:00:00'
GROUP BY
name
UNION ALL
SELECT
name, 'data' AS metric_name, SUM(data) AS metric_value
FROM
table2
WHERE
date > '2017-01-01 00:00:00'
GROUP BY
name
UNION ALL
SELECT
name, 'data' AS metric_name, SUM(data) AS metric_value
FROM
table3
WHERE
date > '2017-01-01 00:00:00'
GROUP BY
name
UNION ALL
.
.
.)
GROUP BY
name, metric_name)
GROUP BY
name
ORDER BY
name ASC
LIMIT 0,20;
The first subselect returns tons of data, so I thought I could count it and return the count as a column value or a row that would propagate to the main select, which is limited to 20 results. I need to know the size of the entire result set, but I don't want to run the same query twice, once without the limit and once with it, just to get the COUNT. There are at least 12 UNION ALL third-level subselects, so why waste resources? I am looking for generic SQL solutions, not necessarily specific to ClickHouse.
I was thinking of using count(*) OVER (), but that is not supported, so if that's the only option I know I need to run the query twice.
The first thing to mention is that users are rarely interested in the exact number of pages a query returns. The count can easily be estimated, and almost no one will care how accurate the estimate is. However, if your GUI has a link to the last page, people will often click it just to see whether it works.
Nevertheless, there are cases when an analyst has to visit every page, and then the GUI should display the exact amount of work. The good news is that in this case a better strategy is to cache a snapshot of the whole result table, after which counting its rows is no longer a problem.
In other words, it makes sense to discuss with your customers whether they really need an exact count, because unnecessary full scans many times a day can affect database load and billing.
Anyway, if you still need to estimate the number of rows, you can simplify the query so that it only counts rows. As I understand it, that would be something like:
SELECT SUM(cnt) as row_count
FROM (
SELECT COUNT(DISTINCT name) as cnt FROM table1 WHERE date > ...
UNION ALL
SELECT COUNT(DISTINCT name) as cnt FROM table2 WHERE date > ...
...
) as counts;
Or, if data is a constant metric name:
SELECT COUNT(DISTINCT name) as row_count
FROM (
SELECT DISTINCT name FROM table1 WHERE date > ...
UNION ALL
SELECT DISTINCT name FROM table2 WHERE date > ...
...
) as names;
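To see how a separate, cheap count query lines up with the paginated query, here is a small sketch using Python's sqlite3 module. The two tables, their rows, and the page size of 2 are made-up assumptions, and SQLite stands in for ClickHouse, so only the generic SQL shape carries over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: name 'c' falls before the cutoff and must not be counted.
conn.executescript("""
CREATE TABLE table1 (name TEXT, data INTEGER, date TEXT);
CREATE TABLE table2 (name TEXT, data INTEGER, date TEXT);
INSERT INTO table1 VALUES
    ('a', 1, '2017-02-01 00:00:00'),
    ('a', 2, '2017-03-01 00:00:00'),
    ('b', 5, '2017-02-01 00:00:00'),
    ('c', 7, '2016-12-01 00:00:00');
INSERT INTO table2 VALUES
    ('b', 3, '2017-04-01 00:00:00'),
    ('d', 4, '2017-05-01 00:00:00');
""")

cutoff = "2017-01-01 00:00:00"

# Count query: only the grouping key is read, no sums are computed.
count_sql = """
SELECT COUNT(DISTINCT name) FROM (
    SELECT name FROM table1 WHERE date > ?
    UNION ALL
    SELECT name FROM table2 WHERE date > ?
) AS names
"""

# Page query: the full aggregation, limited to one page of results.
page_sql = """
SELECT name, SUM(data) AS data FROM (
    SELECT name, data FROM table1 WHERE date > ?
    UNION ALL
    SELECT name, data FROM table2 WHERE date > ?
) AS unioned
GROUP BY name ORDER BY name ASC LIMIT 2 OFFSET 0
"""

total_rows = conn.execute(count_sql, (cutoff, cutoff)).fetchone()[0]
first_page = conn.execute(page_sql, (cutoff, cutoff)).fetchall()
print(total_rows, first_page)  # 3 [('a', 3), ('b', 8)]
```

The count query still scans every table, but it skips the metric aggregation, so it is lighter than running the full query a second time without the LIMIT.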
I have a table with 2 unique linked table ids.
I get the results I want with GROUP BY but when I count I only get the number of each group.
When I do:
SELECT COUNT(id) FROM my_table GROUP BY first_linked_table_id, second_linked_table_id
I get as results 1, 2, 3, 1 but I want 4 as a result.
I tried DISTINCT but I think that only works with one column
Your requirement is to get the number of groups, so we need two operations:
Group (inner query)
Count (outer query)
The following query will do precisely that:
SELECT COUNT(*)
FROM
(
SELECT COUNT(id)
FROM my_table
GROUP BY first_linked_table_id,
second_linked_table_id
) t
If you want to count the rows, I think you're going to need a subquery. Something like this:
SELECT COUNT(*) FROM (
SELECT COUNT(id) FROM my_table GROUP BY first_linked_table_id, second_linked_table_id
) AS t; -- MySQL requires an alias on every derived table
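As a quick check of the group-then-count idea, the sketch below builds a table whose groups have sizes 1, 2, 3 and 1, matching the numbers in the question, and counts the groups with an outer query. The rows are invented, and SQLite (via Python's sqlite3 module) stands in for the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY,"
    " first_linked_table_id INTEGER, second_linked_table_id INTEGER)"
)
# Hypothetical rows forming four distinct id pairs with 1, 2, 3 and 1 rows.
conn.executemany(
    "INSERT INTO my_table (first_linked_table_id, second_linked_table_id)"
    " VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 2), (2, 1), (2, 1), (2, 1), (2, 2)],
)

# Inner query alone: one count per group, which is what the question saw.
per_group = [
    row[0]
    for row in conn.execute(
        "SELECT COUNT(id) FROM my_table"
        " GROUP BY first_linked_table_id, second_linked_table_id"
        " ORDER BY first_linked_table_id, second_linked_table_id"
    )
]

# Outer query: count the rows that the grouping produces.
group_count = conn.execute(
    "SELECT COUNT(*) FROM ("
    " SELECT 1 FROM my_table"
    " GROUP BY first_linked_table_id, second_linked_table_id"
    ") AS t"
).fetchone()[0]

print(per_group, group_count)  # [1, 2, 3, 1] 4
```

On MySQL specifically, COUNT(DISTINCT first_linked_table_id, second_linked_table_id) accepts several expressions and should give the same number when neither column is NULL. SQLite does not accept that form, so it is not used in the sketch.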
I'm using the UNION operator to select results from two different tables. I want results from the first table result to come before those from the second table.
For example: I have the tables customer_coupons and segment_coupons. Both tables have a column named coupon_id. When I run a query involving a UNION of these two tables, it returns the correct records, but they are not in the order I want: it gives me the coupon_ids of both tables mixed in ascending order, but I want to show ALL coupon_ids of the first table and then ALL coupon_ids of the second table.
Here's the query as it currently exists:
SELECT coupon_id
FROM customer_coupons
UNION
SELECT coupon_id
FROM segment_coupons;
How can I change this so that all results from the first half of the query come before all results of the second half?
Put in a fixed table-identifying field:
(SELECT 1 AS source_table, coupon_id
FROM customer_coupons)
UNION ALL
(SELECT 2 AS source_table, coupon_id
FROM segment_coupons)
ORDER BY source_table, coupon_id
Note the parentheses around the individual queries. They force MySQL to apply the ORDER BY to the result of the union, not to the individual subqueries.
SELECT * FROM (
SELECT coupon_id, 1 AS myorder
FROM customer_coupons
UNION
SELECT coupon_id, 2 AS myorder
FROM segment_coupons) AS t
ORDER BY myorder
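The tagging technique from these answers can be sketched with Python's sqlite3 module. The coupon ids are invented, and SQLite stands in for MySQL. SQLite does not accept parentheses around the members of a UNION, so the sketch uses the derived-table form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical coupon ids, deliberately interleaved in value.
conn.executescript("""
CREATE TABLE customer_coupons (coupon_id INTEGER);
CREATE TABLE segment_coupons (coupon_id INTEGER);
INSERT INTO customer_coupons VALUES (5), (2), (9);
INSERT INTO segment_coupons VALUES (1), (7);
""")

# A constant column tags each row with its source table; sorting on it first
# keeps all customer coupons ahead of all segment coupons.
query = """
SELECT coupon_id FROM (
    SELECT 1 AS source_table, coupon_id FROM customer_coupons
    UNION ALL
    SELECT 2 AS source_table, coupon_id FROM segment_coupons
) AS tagged
ORDER BY source_table, coupon_id
"""
ordered_ids = [row[0] for row in conn.execute(query)]
print(ordered_ids)  # [2, 5, 9, 1, 7]
```

Sorting on the tag first and coupon_id second makes the order deterministic within each table as well.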
I have a voting application that writes values to a MySQL table. It is a preference/weighted voting system, so people choose a first option, a second option, and a third option. These all go into separate fields in the table. I'm looking for a way to write a query that assigns numerical values to the responses (3 for a first response, 2 for a second, 1 for a third) and then displays each option with its summed score. I've been able to do this for the total number of votes:
select count(name) as votes,name
from (select 1st_option as name from votes
union all
select 2nd_option from votes
union all
select 3rd_option from votes) as tbl
group by name
having count(name) > 0
order by 1 desc;
but haven't quite figured out how to assign values to response in each column and then pull them together. Any help is much appreciated. Thanks!
You could do something like this:
select sum(score) as votes,name
from (select 1st_option as name, 3 as score from votes
union all
select 2nd_option as name, 2 as score from votes
union all
select 3rd_option as name, 1 as score from votes) as tbl
group by name;
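The weighted query above can be tried on a few ballots with Python's sqlite3 module. The three ballots are made up, and SQLite stands in for MySQL. The column names are double-quoted because SQLite does not accept unquoted identifiers that start with a digit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE votes ("1st_option" TEXT, "2nd_option" TEXT, "3rd_option" TEXT)'
)
# Hypothetical ballots: each row is one voter's ranked choices.
conn.executemany(
    "INSERT INTO votes VALUES (?, ?, ?)",
    [("A", "B", "C"), ("B", "A", "C"), ("A", "C", "B")],
)

# Attach a weight to each column before stacking them, then sum per name.
query = """
SELECT SUM(score) AS votes, name
FROM (
    SELECT "1st_option" AS name, 3 AS score FROM votes
    UNION ALL
    SELECT "2nd_option" AS name, 2 AS score FROM votes
    UNION ALL
    SELECT "3rd_option" AS name, 1 AS score FROM votes
) AS tbl
GROUP BY name
ORDER BY votes DESC, name ASC
"""
scores = conn.execute(query).fetchall()
print(scores)  # [(8, 'A'), (6, 'B'), (4, 'C')]
```

Option A gets 3 + 2 + 3 = 8 points, B gets 2 + 3 + 1 = 6, and C gets 1 + 1 + 2 = 4.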