I'm just a beginner in SQL, so please bear with me :)
I have a table where I store dictionary word translations. The table has the columns id, la (source language), lb (target language), wa (original word), wb (translated word) and some more insignificant columns :). I want to show a table overview: a list of all languages with the count of words in each language, plus a 'sum' row with the total word count (counting only distinct words).
I wrote this query:
SELECT `lng`, COUNT(`word`) AS `count`
FROM (
SELECT DISTINCT *
FROM (
SELECT DISTINCT `la` AS `lng`, `wa` AS `word`
FROM `dict_trans`
UNION ALL
SELECT DISTINCT `lb` AS `lng`, `wb` AS `word`
FROM `dict_trans`
) AS `tbla`
) AS `tblb` GROUP BY `lng`
UNION ALL
SELECT 'sum' AS `lng`, COUNT(`word`) AS `count`
FROM (
SELECT DISTINCT *
FROM (
SELECT DISTINCT `la` AS `lng`, `wa` AS `word`
FROM `dict_trans`
UNION ALL
SELECT DISTINCT `lb` AS `lng`, `wb` AS `word`
FROM `dict_trans`
) AS `tblc`
) AS `tbld` ORDER BY `count` DESC
But I think it's very silly and bad for performance to run the same subquery more than once:
SELECT DISTINCT *
FROM (
SELECT DISTINCT `la` AS `lng`, `wa` AS `word`
FROM `dict_trans`
UNION ALL
SELECT DISTINCT `lb` AS `lng`, `wb` AS `word`
FROM `dict_trans`
) AS `tbla`
In the second part of the query I tried to reference the derived table from the first part:
SELECT `lng`, COUNT(`word`) AS `count`
FROM (
SELECT DISTINCT *
FROM (
SELECT DISTINCT `la` AS `lng`, `wa` AS `word`
FROM `dict_trans`
UNION ALL
SELECT DISTINCT `lb` AS `lng`, `wb` AS `word`
FROM `dict_trans`
) AS `tbla`
) AS `tblb` GROUP BY `lng`
UNION ALL
SELECT 'sum' AS `lng`, COUNT(`word`) AS `count`
FROM `tblb`
ORDER BY `count` DESC
But an error was thrown (#1146 - Table 'db.tblb' doesn't exist).
Is it possible to solve this problem without creating temporary tables?
You don't need all the complexity of your query. UNION means UNION DISTINCT so there is no need for the 3-level nesting and the extra count can be done with the WITH ROLLUP modifier:
SELECT lng AS language,
COUNT(word) AS WordCount
FROM
( SELECT la AS lng, wa AS word
FROM dict_trans
UNION -- DISTINCT is the default
SELECT lb, wb
FROM dict_trans
) AS t
GROUP BY lng
WITH ROLLUP ;
And if you want sorting, you'll need another nesting:
SELECT language, WordCount
FROM
( SELECT COALESCE(lng, 'Total') AS language,
COUNT(word) AS WordCount
FROM
( SELECT la AS lng, wa AS word
FROM dict_trans
UNION -- DISTINCT is the default
SELECT lb, wb
FROM dict_trans
) AS t
GROUP BY lng
WITH ROLLUP
) AS tmp
ORDER BY CASE WHEN language = 'Total' THEN 1 ELSE 0 END,
WordCount DESC ;
The most expensive parts of this query will be the UNION (because you need DISTINCT) and the GROUP BY. If you want efficiency, it would be much better to have a separate table with all distinct language and word combinations and then group by on that table (no union would be needed). dict_trans would then be a "junction" table with 2 foreign keys pointing to that new table of distinct language-word pairs.
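A minimal sketch of that separate-table design, run against SQLite through Python's sqlite3 module so it can be tried without a MySQL server. The table name dict_word, the columns wa_id and wb_id, and the sample rows are assumptions made up for illustration; they are not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table: one row per distinct (language, word) pair.
    CREATE TABLE dict_word (
        id   INTEGER PRIMARY KEY,
        lng  TEXT NOT NULL,
        word TEXT NOT NULL,
        UNIQUE (lng, word)
    );
    -- dict_trans reduced to a junction table with two foreign keys.
    CREATE TABLE dict_trans (
        id    INTEGER PRIMARY KEY,
        wa_id INTEGER NOT NULL REFERENCES dict_word (id),
        wb_id INTEGER NOT NULL REFERENCES dict_word (id)
    );
    INSERT INTO dict_word (id, lng, word) VALUES
        (1, 'en', 'dog'), (2, 'en', 'cat'), (3, 'de', 'Hund'),
        (4, 'de', 'Katze'), (5, 'fr', 'chien');
    INSERT INTO dict_trans (wa_id, wb_id) VALUES (1, 3), (2, 4), (1, 5);
""")

# No UNION and no DISTINCT: the UNIQUE constraint already guarantees distinct pairs.
per_language = conn.execute(
    "SELECT lng, COUNT(*) FROM dict_word GROUP BY lng ORDER BY lng"
).fetchall()
total = conn.execute("SELECT COUNT(*) FROM dict_word").fetchone()[0]
print(per_language, total)
```

With this layout the per-language counts come from a plain GROUP BY on one table, and in MySQL the total row could still be produced with WITH ROLLUP as shown above.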
I need to select all branches from the branches table, grouped by store_id and ordered by distance.
In other words, I need to get the nearest branch of each store.
select distinct `branch_id`, `store_id`, `branch_name`, distance
from `store_branches`
groupBy store_id having MIN(distance)
orderBy `distance` asc
HAVING MIN(distance) causes the query to return an empty result.
I used HAVING because ORDER BY is not doing the job, since GROUP BY is applied before ORDER BY.
With a correlated subquery:
select sb.`branch_id`, sb.`store_id`, sb.`branch_name`, sb.distance
from `store_branches` sb
where sb.`distance` = (select min(distance) from `store_branches` where `store_id` = sb.`store_id`)
or with NOT EXISTS:
select sb.`branch_id`, sb.`store_id`, sb.`branch_name`, sb.distance
from `store_branches` sb
where not exists (
select 1 from `store_branches`
where `store_id` = sb.`store_id` and `distance` < sb.`distance`
)
For MySQL 8.0+ you could use the RANK() window function:
select sb.`branch_id`, sb.`store_id`, sb.`branch_name`, sb.distance
from (
select *, rank() over (partition by `store_id` order by `distance`) rnk
from `store_branches`
) sb
where sb.rnk = 1
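To see how the first two variants behave, here is a small sketch that runs them on invented sample rows. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL, and the column types and data are assumptions. Store 20 deliberately has two branches at the same distance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store_branches (
        branch_id   INTEGER PRIMARY KEY,
        store_id    INTEGER NOT NULL,
        branch_name TEXT NOT NULL,
        distance    REAL NOT NULL
    );
    INSERT INTO store_branches VALUES
        (1, 10, 'North', 5.0),
        (2, 10, 'South', 2.5),
        (3, 20, 'East',  7.0),
        (4, 20, 'West',  7.0),
        (5, 30, 'Main',  1.0);
""")

# Variant 1: keep rows whose distance equals the minimum for their store.
correlated = conn.execute("""
    SELECT sb.branch_id, sb.store_id, sb.branch_name, sb.distance
    FROM store_branches sb
    WHERE sb.distance = (SELECT MIN(distance) FROM store_branches
                         WHERE store_id = sb.store_id)
    ORDER BY sb.branch_id
""").fetchall()

# Variant 2: keep rows for which no closer branch of the same store exists.
not_exists = conn.execute("""
    SELECT sb.branch_id, sb.store_id, sb.branch_name, sb.distance
    FROM store_branches sb
    WHERE NOT EXISTS (SELECT 1 FROM store_branches
                      WHERE store_id = sb.store_id AND distance < sb.distance)
    ORDER BY sb.branch_id
""").fetchall()

print(correlated)
```

Both variants return the same rows, and both return every tied branch (here East and West for store 20). RANK() behaves the same way on ties; if exactly one row per store is needed, an extra tie-breaker is required.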
I currently have this query:
SET sql_mode='';
SELECT `id`, `url`,`number`,`abbrev`,`content`,`label`,`hier-1` FROM `leganalyse_unitsub_2020` WHERE `id` IN (SELECT MAX(`id`) FROM `leganalyse_unitsub_2020` GROUP BY `url`)
The subquery in the following clause:
SELECT MAX(`id`) FROM `leganalyse_unitsub_2020` GROUP BY `url`
returns a set of integer values.
For each integer value returned x, I'd like to also include x-1, x-2, x-3, x-4 and x-5.
So something like:
SELECT `id`, `url`,`number`,`abbrev`,`content`,`label`,`hier-1` FROM `leganalyse_unitsub_2020` WHERE `id` IN
(
SELECT MAX(`id`) FROM `leganalyse_unitsub_2020` GROUP BY `url`
UNION
SELECT MAX(`id`)-1 FROM `leganalyse_unitsub_2020` GROUP BY `url`
UNION
SELECT MAX(`id`)-2 FROM `leganalyse_unitsub_2020` GROUP BY `url`
UNION
SELECT MAX(`id`)-3 FROM `leganalyse_unitsub_2020` GROUP BY `url`
UNION
SELECT MAX(`id`)-4 FROM `leganalyse_unitsub_2020` GROUP BY `url`
)
But I am not sure if this is correct.
What query will do what I have described?
Cross join the subquery with a query that returns 0, 1, 2, 3, 4 to get all the possible values:
SELECT id, url, number, abbrev, content, label, `hier-1`
FROM leganalyse_unitsub_2020 WHERE id IN (
SELECT g.id - t.id
FROM (SELECT 0 id UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4) t
CROSS JOIN (SELECT MAX(id) id FROM leganalyse_unitsub_2020 GROUP BY url) g
)
This code is based on your last query.
If you want the values to be x-1, x-2, x-3, x-4 and x-5, as you mention in the question, then you should use this subquery instead:
SELECT 1 id UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5
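A small sketch of the cross join on made-up data, using SQLite through Python's sqlite3 module. The table name units and its ten rows are assumptions for illustration, and the offset column is named n here only to keep it apart from id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE units (id INTEGER PRIMARY KEY, url TEXT NOT NULL)")
# ids 1..6 belong to url 'a', ids 7..10 to url 'b'.
conn.executemany(
    "INSERT INTO units (id, url) VALUES (?, ?)",
    [(i, "a" if i <= 6 else "b") for i in range(1, 11)],
)

# Every per-url maximum (6 and 10) is combined with every offset 0..4.
ids = [row[0] for row in conn.execute("""
    SELECT id FROM units WHERE id IN (
        SELECT g.id - t.n
        FROM (SELECT 0 AS n UNION ALL SELECT 1 UNION ALL SELECT 2
              UNION ALL SELECT 3 UNION ALL SELECT 4) t
        CROSS JOIN (SELECT MAX(id) AS id FROM units GROUP BY url) g
    )
    ORDER BY id
""")]
print(ids)
```

In this data id 6 is reached twice (as the maximum for 'a' and as 10 - 4 for 'b'); IN simply ignores the duplicate. It also shows that x - k can land on a row of a different url, since the arithmetic looks only at id values.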
Let's say I was looking for the second highest record.
Sample Table:
CREATE TABLE `my_table` (
`id` int(2) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
`value` int(10),
PRIMARY KEY (`id`)
);
INSERT INTO `my_table` (`id`, `name`, `value`) VALUES (NULL, 'foo', '200'), (NULL, 'bar', '100'), (NULL, 'baz', '0'), (NULL, 'quux', '300');
The second highest value is foo. How many ways can you get this result?
The obvious example is:
SELECT name FROM my_table ORDER BY value DESC LIMIT 1 OFFSET 1;
Can you think of other examples?
I tried this one, but MySQL does not support LIMIT inside an IN/ALL/ANY/SOME subquery.
SELECT name FROM my_table WHERE value IN (
SELECT MIN(value) FROM my_table ORDER BY value DESC LIMIT 1
) LIMIT 1;
Eduardo's solution in standard SQL
select *
from (
select id,
name,
value,
row_number() over (order by value desc) as rn
from my_table t
) t
where rn = 2 -- rn = 1 is the highest value, rn = 2 the second highest, and so on
This works on any modern DBMS, including MySQL from version 8.0 on (earlier MySQL versions have no window functions). This solution is usually faster than solutions using sub-selects. It also can easily return the 2nd, 3rd, ... row (again this is achievable with Eduardo's solution as well).
It can also be adjusted to count by groups (adding a partition by) so the "greatest-n-per-group" problem can be solved with the same pattern.
Here is a SQLFiddle to play around with: http://sqlfiddle.com/#!12/286d0/1
This only works for exactly the second highest:
SELECT * FROM my_table two
WHERE EXISTS (
SELECT * FROM my_table one
WHERE one.value > two.value
AND NOT EXISTS (
SELECT * FROM my_table zero
WHERE zero.value > one.value
)
)
LIMIT 1
;
This one emulates a window function rank() for platforms that don't have them. It can also be adapted for ranks <> 2 by altering one constant:
SELECT one.*
-- , 1+COALESCE(agg.rnk,0) AS rnk
FROM my_table one
LEFT JOIN (
SELECT one.id , COUNT(*) AS rnk
FROM my_table one
JOIN my_table cnt ON cnt.value > one.value
GROUP BY one.id
) agg ON agg.id = one.id
WHERE agg.rnk=1 -- the aggregate starts counting at zero
;
Both solutions need functional self-joins (I don't know if mysql allows them, IIRC it only disallows them if the table is the target for updates or deletes)
The below one does not need window functions, but uses a recursive query to enumerate the rankings:
WITH RECURSIVE agg AS (
SELECT one.id
, one.value
, 1 AS rnk
FROM my_table one
WHERE NOT EXISTS (
SELECT * FROM my_table zero
WHERE zero.value > one.value
)
UNION ALL
SELECT two.id
, two.value
, agg.rnk+1 AS rnk
FROM my_table two
JOIN agg ON two.value < agg.value
WHERE NOT EXISTS (
SELECT * FROM my_table nx
WHERE nx.value > two.value
AND nx.value < agg.value
)
)
SELECT * FROM agg
WHERE rnk = 2
;
(the recursive query will not work in MySQL before 8.0, which added WITH RECURSIVE)
You can use inline initialization like this:
select * from (
select id,
name,
value,
@curRank := @curRank + 1 AS `rank`
from my_table t, (SELECT @curRank := 0) r
order by value desc
) tb
where tb.`rank` = 2
SELECT name
FROM my_table
WHERE value < (SELECT max(value) FROM my_table)
ORDER BY value DESC
LIMIT 1
SELECT name
FROM my_table
WHERE value = (
SELECT min(r.value)
FROM (
SELECT name, value
FROM my_table
ORDER BY value DESC
LIMIT 2
) r
)
LIMIT 1
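As a quick cross-check, the sketch below runs several of the portable variants from this thread against the sample data and collects what each returns. It uses SQLite through Python's sqlite3 module rather than MySQL, so the column types are adapted, and the dictionary labels are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (
        id    INTEGER PRIMARY KEY AUTOINCREMENT,
        name  TEXT NOT NULL,
        value INTEGER
    );
    INSERT INTO my_table (name, value) VALUES
        ('foo', 200), ('bar', 100), ('baz', 0), ('quux', 300);
""")

queries = {
    # Sort descending and skip the first row.
    "limit_offset": """
        SELECT name FROM my_table ORDER BY value DESC LIMIT 1 OFFSET 1""",
    # Highest value that is strictly below the maximum.
    "below_max": """
        SELECT name FROM my_table
        WHERE value < (SELECT MAX(value) FROM my_table)
        ORDER BY value DESC LIMIT 1""",
    # Self-join: the row that has exactly one greater value above it.
    "count_greater": """
        SELECT one.name FROM my_table one
        JOIN my_table cnt ON cnt.value > one.value
        GROUP BY one.id, one.name
        HAVING COUNT(*) = 1""",
    # Smallest value among the top two.
    "min_of_top_two": """
        SELECT name FROM my_table
        WHERE value = (SELECT MIN(r.value)
                       FROM (SELECT value FROM my_table
                             ORDER BY value DESC LIMIT 2) r)
        LIMIT 1""",
}

results = {label: conn.execute(sql).fetchone()[0] for label, sql in queries.items()}
print(results)
```

The variants agree here because the sample values are all distinct. With ties for first or second place they can diverge, so it is worth deciding up front whether "second highest" means the second row or the second distinct value.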
Is there any way I can change this SQL so the terms are defined only once?
SELECT sum(score) score, title
FROM
(
SELECT
score,
title
FROM
(
SELECT 3 score, 'a railway employee' term UNION ALL
SELECT 2 score, 'a railway' term UNION ALL
SELECT 2 score, 'railway employee' term UNION ALL
SELECT 1 score, 'a' term UNION ALL
SELECT 1 score, 'railway' term UNION ALL
SELECT 1 score, 'employee' term
) terms
INNER JOIN tableName ON title LIKE concat('%', terms.term, '%')
UNION ALL
SELECT
score*1.1 score,
title
FROM
(
SELECT 3 score, 'a railway employee' term UNION ALL
SELECT 2 score, 'a railway' term UNION ALL
SELECT 2 score, 'railway employee' term UNION ALL
SELECT 1 score, 'a' term UNION ALL
SELECT 1 score, 'railway' term UNION ALL
SELECT 1 score, 'employee' term
) terms
INNER JOIN tableName ON summary LIKE concat('%', terms.term, '%')
) AS t
GROUP BY title
ORDER BY score DESC
If you don't want to write them out twice, why not just create a table that stores the terms and the scores and then you join on the table:
create table terms
(
term varchar(50),
score int
);
insert into terms values
('a railway employee', 3),
('a railway', 2),
('railway employee', 2),
('a', 1),
('railway', 1),
('employee', 1);
Then the query will be:
SELECT sum(score) score, title
FROM
(
SELECT score,title
FROM terms
INNER JOIN tableName ON title LIKE concat('%', terms.term, '%')
UNION ALL
SELECT score*1.1 score, title
FROM terms
INNER JOIN tableName ON summary LIKE concat('%', terms.term, '%')
) AS t
GROUP BY title
ORDER BY score DESC;
Note: I do advise that you put the values into their own table. Just sticking them in the query text is probably not ideal. But the queries I present below will work equally well with a real table as with a hard-coded derived table.
Here's one way:
SELECT
sum(score * multiplier) score,
title
FROM
(
SELECT 3 score, 'a railway employee' term UNION ALL
SELECT 2, 'a railway' UNION ALL
SELECT 2, 'railway employee' UNION ALL
SELECT 1, 'a' UNION ALL
SELECT 1, 'railway' UNION ALL
SELECT 1, 'employee'
) terms
CROSS JOIN (
SELECT 'title' which, 1 multiplier
UNION ALL SELECT 'summary', 1.1
) X
INNER JOIN tableName ON
CASE
X.which WHEN 'title' THEN title
WHEN 'summary' THEN summary
END
LIKE concat('%', terms.term, '%')
GROUP BY title
ORDER BY score DESC
;
And here's another way that is basically the same but shuffled around a little bit:
SELECT
sum(terms.score * T.multiplier) score,
title
FROM
(
SELECT 3 score, 'a railway employee' term UNION ALL
SELECT 2, 'a railway' UNION ALL
SELECT 2, 'railway employee' UNION ALL
SELECT 1, 'a' UNION ALL
SELECT 1, 'railway' UNION ALL
SELECT 1, 'employee'
) terms
INNER JOIN (
SELECT
title,
CASE
X.which WHEN 'title' THEN title
WHEN 'summary' THEN summary
END comparison,
X.multiplier
FROM
tableName
CROSS JOIN (
SELECT 'title' which, 1 multiplier
UNION ALL SELECT 'summary', 1.1
) X
) T ON T.comparison LIKE concat('%', terms.term, '%')
GROUP BY title
ORDER BY score DESC
;
And finally, one more way:
SELECT *
FROM
(
SELECT
sum(
terms.score * (
CASE WHEN T.title LIKE concat('%', terms.term, '%') THEN 1 ELSE 0 END
+ CASE WHEN T.summary LIKE concat('%', terms.term, '%') THEN 1.1 ELSE 0 END
)
) score,
title
FROM
tableName T
CROSS JOIN (
SELECT 3 score, 'a railway employee' term UNION ALL
SELECT 2, 'a railway' UNION ALL
SELECT 2, 'railway employee' UNION ALL
SELECT 1, 'a' UNION ALL
SELECT 1, 'railway' UNION ALL
SELECT 1, 'employee'
) terms
GROUP BY title
ORDER BY score DESC
) Z
WHERE
Z.score > 0
;
Also, if MySQL has something like CROSS APPLY that will let the CROSS JOIN have an outer reference, then some of this becomes easier (e.g., the first query could lose the CASE statement completely).
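Another option that keeps the terms in one place without creating a table is a common table expression, which can be referenced more than once in the same statement. This is not one of the approaches above, and in MySQL it assumes version 8.0 or later. The sketch runs on SQLite through Python's sqlite3 module, so it uses the || operator where MySQL would use concat(); the two sample rows and the total_score alias are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableName (title TEXT, summary TEXT);
    INSERT INTO tableName VALUES
        ('A railway employee story', 'nothing'),
        ('Cats', 'railway news');
""")

rows = conn.execute("""
    WITH terms AS (              -- the terms are written out only once
        SELECT 3 AS score, 'a railway employee' AS term UNION ALL
        SELECT 2, 'a railway' UNION ALL
        SELECT 2, 'railway employee' UNION ALL
        SELECT 1, 'a' UNION ALL
        SELECT 1, 'railway' UNION ALL
        SELECT 1, 'employee'
    )
    SELECT SUM(score) AS total_score, title
    FROM (
        SELECT score, title
        FROM terms
        INNER JOIN tableName ON title LIKE '%' || terms.term || '%'
        UNION ALL
        SELECT score * 1.1, title
        FROM terms
        INNER JOIN tableName ON summary LIKE '%' || terms.term || '%'
    ) AS t
    GROUP BY title
    ORDER BY total_score DESC
""").fetchall()

for total_score, title in rows:
    print(title, total_score)
```

The body of the query is the same as the terms-table version; only the source of terms changes, so switching to a real table later means deleting the WITH clause.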
Maybe I don't understand the question...dinner...wine...etc ... but can you use multiple columns?
select animal, score + score2 as combinedScore
from
(
select 'cat' as animal, 1 as score, 1 * 1.1 as score2
union
select 'dog' as animal, 2 as score, 2 * 2.2 as score2
) as X
I have a table called receiving with 4 columns:
id, date, volume, volume_units
The volume units are always stored as a value of either "Lbs" or "Gals".
I am trying to write an SQL query to get the sum of the volumes in Lbs and Gals for a specific date range. Something along the lines of this (which doesn't work):
SELECT sum(p1.volume) as lbs,
p1.volume_units,
sum(p2.volume) as gals,
p2.volume_units
FROM receiving as p1, receiving as p2
where p1.volume_units = 'Lbs'
and p2.volume_units = 'Gals'
and p1.date between "2012-01-01" and "2012-03-07"
and p2.date between "2012-01-01" and "2012-03-07"
When I run these queries separately the results are way off. I know the join is wrong here, but I don't know what I am doing wrong to fix it.
SELECT SUM(volume) AS total_sum,
volume_units
FROM receiving
WHERE `date` BETWEEN '2012-01-01'
AND '2012-03-07'
GROUP BY volume_units
You can achieve this in one query by using IF(condition,then,else) within the SUM:
SELECT SUM(IF(volume_units="Lbs",volume,0)) as lbs,
SUM(IF(volume_units="Gals",volume,0)) as gals
FROM receiving
WHERE `date` between "2012-01-01" and "2012-03-07"
This only adds volume if it is of the right unit.
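The same conditional-sum idea can be tried out with a small sketch. It runs on SQLite through Python's sqlite3 module, so it uses CASE WHEN, the portable equivalent of MySQL's IF(); the sample rows are invented, and two of them fall outside the date range on purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE receiving (
        id           INTEGER PRIMARY KEY,
        date         TEXT NOT NULL,
        volume       REAL NOT NULL,
        volume_units TEXT NOT NULL
    );
    INSERT INTO receiving (date, volume, volume_units) VALUES
        ('2012-01-15', 100, 'Lbs'),
        ('2012-02-01',  40, 'Gals'),
        ('2012-03-07',  25, 'Lbs'),
        ('2012-03-08', 999, 'Gals'),  -- after the range, ignored
        ('2011-12-31', 500, 'Lbs');   -- before the range, ignored
""")

# Each SUM only counts rows of its own unit; other rows contribute 0.
lbs, gals = conn.execute("""
    SELECT SUM(CASE WHEN volume_units = 'Lbs'  THEN volume ELSE 0 END) AS lbs,
           SUM(CASE WHEN volume_units = 'Gals' THEN volume ELSE 0 END) AS gals
    FROM receiving
    WHERE date BETWEEN '2012-01-01' AND '2012-03-07'
""").fetchone()
print(lbs, gals)
```

Because the table is scanned once and no self-join is involved, the row multiplication that made the original attempt return inflated sums cannot happen.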
This query will display the totals for each ID.
SELECT s.`id`,
CONCAT(s.TotalLbsVolume, ' ', 'lbs') as TotalLBS,
CONCAT(s.TotalGalVolume, ' ', 'gals') as TotalGAL
FROM
(
SELECT a.`id`, SUM(a.`volume`) as TotalLbsVolume, b.TotalGalVolume
FROM Receiving a INNER JOIN
(
SELECT `id`, SUM(`volume`) as TotalGalVolume
FROM Receiving
WHERE (volume_units = 'Gals') AND
(`date` between '2012-01-01' and '2012-03-07')
GROUP BY `id`
) b ON a.`id` = b.`id`
WHERE (a.volume_units = 'Lbs') AND
(a.`date` between '2012-01-01' and '2012-03-07')
GROUP BY a.`id`, b.TotalGalVolume
) s
This is a cross join with no visible join condition; I don't think you meant that.
If you want to sum quantities you don't need to join at all; just group as zerkms did.
You can simply group by date and volume_units without self-join.
SELECT date, volume_units, sum(volume) sum_vol
FROM receiving
WHERE date between "2012-01-01" and "2012-03-07"
GROUP BY date, volume_units
Sample test:
select d, vol_units, sum(vol) sum_vol
from
(
select 1 id, '2012-03-07' d, 1 vol, 'lbs' vol_units
union
select 2 id, '2012-03-07' d, 2 vol, 'Gals' vol_units
union
select 3 id, '2012-03-08' d, 1 vol, 'lbs' vol_units
union
select 4 id, '2012-03-08' d, 2 vol, 'Gals' vol_units
union
select 5 id, '2012-03-07' d, 10 vol, 'lbs' vol_units
) t
group by d, vol_units