select: result based on occurrence of explicit value - mysql

Given the following MySQL table:
CREATE TABLE fonts
(`id` int, `fontName` varchar(22), `price` int,`reducedPrice` int,`weight` int)
;
INSERT INTO fonts
(`id`, `fontName`, `price`,`reducedprice`,`weight`)
VALUES
(1, 'regular', 50,30,1),
(2, 'regular-italic', 50,20,1),
(3, 'medium', 60,30,2),
(4, 'medium-italic', 50,30,2),
(5, 'bold', 50,30,3),
(6, 'bold-italic', 50,30,3),
(7, 'bold-condensed', 50,30,3),
(8, 'super', 50,30,4)
;
As an example, a user chooses the following ids: 1,2,3,5,6,7,
which would result in the following query/result:
> select * from fonts where id in(1,2,3,5,6,7);
id fontName price reducedPrice weight
1 regular 50 30 1
2 regular-italic 50 20 1
3 medium 60 30 2
5 bold 50 30 3
6 bold-italic 50 30 3
7 bold-condensed 50 30 3
Is it possible to have a kind of "if statement" in a query that returns a new field based on the weight column? Where a weight value occurs more than once among the selected rows, reducedPrice should be returned as newPrice; otherwise price:
id fontName price reducedPrice weight newPrice
1 regular 50 30 1 30
2 regular-italic 50 20 1 20
3 medium 60 30 2 60
5 bold 50 30 3 30
6 bold-italic 50 30 3 30
7 bold-condensed 50 30 3 30
This means ids 1, 2, 5, 6 and 7 should be reduced, but id 3 should not, as its weight "2" only occurs once.
Please find a fiddle here: http://sqlfiddle.com/#!9/73f5db/1
And thanks for your help!

Write a subquery that gets the number of occurrences of each weight, and join with this. Then you can test the number of occurrences to decide which field to put in NewPrice.
SELECT f.*, IF(weight_count = 1, Price, ReducedPrice) AS NewPrice
FROM fonts AS f
JOIN (SELECT weight, COUNT(*) AS weight_count
FROM fonts
WHERE id IN (1, 2, 3, 5, 6, 7)
GROUP BY weight) AS w ON f.weight = w.weight
WHERE id IN (1, 2, 3, 5, 6, 7)
Updated fiddle
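As a possible alternative on MySQL 8.0 or later (an assumption; the question does not state a version), the count per weight can be computed with a window function instead of a joined subquery. This is only a sketch of the same idea. Window functions are evaluated after the WHERE clause, so the count covers only the selected ids:

```sql
-- Sketch, assuming MySQL 8.0+ (window functions are not available before 8.0).
-- COUNT(*) OVER (PARTITION BY weight) counts the selected rows that share a weight.
SELECT f.*,
       IF(COUNT(*) OVER (PARTITION BY weight) > 1, reducedPrice, price) AS newPrice
FROM fonts AS f
WHERE id IN (1, 2, 3, 5, 6, 7)
ORDER BY id;
```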

select fonts.*, if(t.occurrences >= 2, reducedPrice, price) as newPrice from fonts
left join (select count(id) as occurrences, weight from fonts
where fonts.id in (1,2,3,5,6,7) group by weight) t on t.weight = fonts.weight
where fonts.id in (1,2,3,5,6,7);
The MySQL IF() function reference is here: https://dev.mysql.com/doc/refman/5.1/en/control-flow-functions.html#function_if
Edit: Added fiddle, changed to instances as comment requested.
Updated fiddle: http://sqlfiddle.com/#!9/a93ef/14

SELECT DISTINCT x.*
, CASE WHEN y.weight = x.weight THEN x.reducedPrice ELSE x.price END newPrice
FROM fonts x
LEFT
JOIN
( SELECT * FROM fonts WHERE id IN(1,2,3,5,6,7) )y
ON y.weight = x.weight
AND y.id <> x.id
WHERE x.id IN(1,2,3,5,6,7)
ORDER
BY id;

Related

Is there an SQL query where we apply a set of conditions separately for each id?

Eg. If we have a table like this exam_score (record_date refers to month when record is taken, Jan = 1, Feb = 2 etc):
student country score record_date
1 US 70 1
2 US 60 2
3 US 80 3
4 IT 60 2
5 IT 100 4
6 BR 80 5
Which SQL query allows me to generate a table where, for each student, we obtain the highest score obtained by fellow students in the same country before him?
So in this case, I should have something like
student country score record_date max_score
1 US 70 1 null (no US students before him)
2 US 60 2 70 (among students before him, top score is 70)
3 US 80 3 70 (among students before him, top score is 70)
4 IT 60 2 null (no IT students before him)
5 IT 100 4 60
6 BR 80 5 null
Currently my workaround is to use Python together with SQL queries to get what I want, but could we do this with SQL alone? I'm using MySQL, but maybe the database doesn't matter in terms of the SQL query.
You can use a correlated subquery in the SELECT list:
select s.*,
(select max(score)
from exam_score
where country = s.country
and record_date < s.record_date) as max_score
from exam_score s
order by record_date
Schema (MySQL v8.0)
CREATE TABLE exam_score
(`student` int, `country` varchar(2), `score` int, `record_date` int)
;
INSERT INTO exam_score
(`student`, `country`, `score`, `record_date`)
VALUES
(1, 'US', 70, 1),
(2, 'US', 60, 2),
(3, 'US', 80, 3),
(4, 'IT', 60, 2),
(5, 'IT', 100, 4),
(6, 'BR', 80, 5)
;
Query #1
SELECT student
, country
, score
, record_date
, MAX(lag_score) OVER (PARTITION BY country ORDER BY record_date) AS max_score
FROM (
SELECT student
, country
, record_date
, score
, LAG(score) OVER (PARTITION BY country ORDER BY record_date) AS lag_score
FROM exam_score
) a
ORDER BY student;
Output:
student country score record_date max_score
1 US 70 1 null
2 US 60 2 70
3 US 80 3 70
4 IT 60 2 null
5 IT 100 4 60
6 BR 80 5 null
View on DB Fiddle
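A single window function with an explicit frame may also work here, without the LAG step. The sketch below assumes MySQL 8.0+ and assumes that record_date values are distinct within a country, as they are in the sample data. With ties, a ROWS frame would include an earlier row that has the same record_date, which differs from a strict "before" comparison.

```sql
-- Sketch, assuming MySQL 8.0+ and no duplicate record_date within a country.
-- The frame covers every earlier row of the partition and excludes the current row,
-- so the first row per country gets NULL.
SELECT student,
       country,
       score,
       record_date,
       MAX(score) OVER (
           PARTITION BY country
           ORDER BY record_date
           ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
       ) AS max_score
FROM exam_score
ORDER BY student;
```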

How to list and group same records in different columns?

Hello, I have a table with 2 columns, like this:
group_id product_id
2 65
2 50
2 30
2 60
2 42
5 40
5 42
6 30
6 65
6 60
7 90
7 40
I want to get a list of the product IDs that belong to the same group ID.
You haven't specified your desired output. But if you want a comma-separated list per group, you can do something like the answers in the Stack Overflow threads linked below (replace [YOURTABLE] with your table name):
MySQL
SELECT t.group_id,
GROUP_CONCAT(t.product_id) AS product_id_group
FROM [YOURTABLE] t
GROUP BY t.group_id;
MySQL Results as comma separated list
SQL Server
SELECT group_id, product_id =
STUFF((SELECT ', ' + CAST(product_id AS varchar(10))
FROM [YOURTABLE] t1
WHERE t1.group_id = t2.group_id
FOR XML PATH('')), 1, 2, '')
FROM [YOURTABLE] t2
GROUP BY group_id
SQL Server : GROUP BY clause to get comma-separated values
This will return a result set like:
group_id product_id
2 65, 50, 30, 60, 42
5 40, 42
6 30, 65, 60
7 90, 40
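Note that GROUP_CONCAT separates values with a plain comma by default and does not guarantee an order. If the ", " separator shown above and a stable order matter, something along these lines may help in MySQL. The table name group_products is hypothetical, since the question does not name the table:

```sql
-- Sketch; group_products is a hypothetical table name with columns (group_id, product_id).
-- ORDER BY and SEPARATOR inside GROUP_CONCAT control the order and the delimiter of the list.
SELECT group_id,
       GROUP_CONCAT(product_id ORDER BY product_id SEPARATOR ', ') AS product_ids
FROM group_products
GROUP BY group_id
ORDER BY group_id;
```

Long lists are truncated at the group_concat_max_len session limit, so that setting may need raising for large groups.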

MySQL single column n-gram split and count

Given a column of strings (passwords) in MySQL and a value N, I'm looking for an SQL way to count the frequency of each n-gram (substring of length N).
It's important to keep the code inside MySQL, because in the other environments I have it results in a memory overflow.
The only working approach I have found so far is to assume a limited string length (a legitimate assumption), select the substrings at each starting position separately, UNION them, and then group and count, like this (for 9-grams out of at most 13 characters):
Select
nueve,
count(*) as density,
avg(location) as avgloc
From
(select
mid(pass, 1, 9) as nueve, 1 as location
from
passdata
where
length(pass) >= 9 and length(pass) <= 13 UNION ALL select
mid(pass, 2, 9), 2 as location
from
passdata
where
length(pass) >= 10 and length(pass) <= 13 UNION ALL select
mid(pass, 3, 9), 3 as location
from
passdata
where
length(pass) >= 11 and length(pass) <= 13 UNION ALL select
mid(pass, 4, 9), 4 as location
from
passdata
where
length(pass) >= 12 and length(pass) <= 13 UNION ALL select
mid(pass, 5, 9), 5 as location
from
passdata
where
length(pass) = 13) as nueves
group by nueve
order by density DESC
The results are looking like this:
nueve density avgloc
123456789 1387 2.4564
234567890 193 2.7306
987654321 141 2.0355
password1 111 1.7748
123123123 92 1.913
liverpool 89 1.618
111111111 86 2.2791
where nueve is the 9-gram, density is the number of appearances, and avgloc is the mean starting location in the string
Any suggestions to improve the query? I'm doing the same for other n-grams too.
Thanks!
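Create a table that contains all the numbers from 1 to the maximum password length. You can then join with it to get the substring positions. In the query below, #N is a placeholder for the n-gram length; replace it with a literal value before running, since # starts a comment in MySQL.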
SELECT nueve, COUNT(*) AS density, AVG(location) as avgloc
FROM (
SELECT MID(p.pass, n.num, #N) AS nueve, n.num AS location
FROM passdata AS p
JOIN numbers_table AS n ON LENGTH(p.pass) >= (#N + n.num - 1)
) AS x
GROUP BY nueve
ORDER BY density DESC
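For illustration, the helper table and the query for 9-grams might look like the sketch below. The column definition of numbers_table and the upper bound of 13 are assumptions taken from the question's example, not something the answer specifies.

```sql
-- Sketch: a helper table holding the possible starting positions 1..13
-- (13 is the maximum password length assumed in the question).
CREATE TABLE numbers_table (num INT NOT NULL PRIMARY KEY);
INSERT INTO numbers_table (num)
VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12), (13);

-- Same query as above with N = 9 written as a literal.
SELECT nueve, COUNT(*) AS density, AVG(location) AS avgloc
FROM (
    SELECT MID(p.pass, n.num, 9) AS nueve, n.num AS location
    FROM passdata AS p
    JOIN numbers_table AS n ON LENGTH(p.pass) >= (9 + n.num - 1)
) AS x
GROUP BY nueve
ORDER BY density DESC;
```

Unlike the original UNION ALL query, this version does not exclude passwords longer than 13 characters; it only ignores starting positions beyond 13.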

MySQL - Multiply column by value depending on that column

How can I write this? I have a table 'Company' with a column 'Size'. The Size column holds enum codes. I need to display the average company size under the alias 'AverageEstimatedCompanySize' by substituting the following values for 'Size':
1 = 15
2 = 30
3 = 50
4 = 100
5 = 250
In other words, my table shows company size as either 1, 2, 3, 4 or 5. While 1 is actually a company size of 15.
This is all part of a bigger query:
SELECT COUNT(DISTINCT(ID)) AS 'Total # of Opps', AVG(Size*?) AS 'AverageEstimatedCompanySize'
FROM persontable AS POJT INNER JOIN opportunity
ON POJT.ID = opportunity.id
WHERE opportunity.TimeStamp >= '2012-01-01' AND opportunity.TimeStamp <= '2012-12-31' AND POJT.JobTitleID IN
(SELECT Id
FROM job
WHERE CategoryID IN
(SELECT id
FROM job_category
WHERE name IN ('Sc', 'Ma', 'Co', 'En', 'Tr')))
Sounds like something solvable with a case statement. The following is untested but should point you in the right direction.
SELECT
COUNT(DISTINCT(ID)) AS 'Total # of Opps',
AVG(
CASE Size
WHEN 1 THEN 15
WHEN 2 THEN 30
WHEN 3 THEN 50
WHEN 4 THEN 100
WHEN 5 THEN 250
END
) AS 'AverageEstimatedCompanySize'
FROM persontable AS POJT INNER JOIN opportunity
ON POJT.ID = opportunity.id
WHERE opportunity.TimeStamp >= '2012-01-01' AND opportunity.TimeStamp <= '2012-12-31' AND POJT.JobTitleID IN
(SELECT Id
FROM job
WHERE CategoryID IN
(SELECT id
FROM job_category
WHERE name IN ('Sc', 'Ma', 'Co', 'En', 'Tr')))
I'm thinking that one approach might be to modify the query to JOIN to the Company table appropriately (that's something you'll need to work out), and then modify the AVG statement:
... AVG(CASE `Size`
WHEN 1 THEN 15
WHEN 2 THEN 30
WHEN 3 THEN 50
WHEN 4 THEN 100
WHEN 5 THEN 250 END) AS 'AverageEstimatedCompanySize'
where Size is from the Company table.
Now, a more dynamic approach would be to create a new field, or even a new table, that maps those size codes to the actual sizes. You would then JOIN the Company table to the new table in the query and read the mapped value for each row. That would get rid of the CASE expression.
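The mapping-table idea could look roughly like this. The table company_size_map and its columns are hypothetical names, and the query is reduced to the Company table alone because the question does not show how Company joins to the rest of the query:

```sql
-- Sketch; company_size_map, size_code and estimated_size are hypothetical names.
CREATE TABLE company_size_map (
    size_code      INT NOT NULL PRIMARY KEY,
    estimated_size INT NOT NULL
);
INSERT INTO company_size_map (size_code, estimated_size)
VALUES (1, 15), (2, 30), (3, 50), (4, 100), (5, 250);

-- The lookup replaces the CASE expression inside AVG().
SELECT AVG(m.estimated_size) AS AverageEstimatedCompanySize
FROM Company AS c
JOIN company_size_map AS m ON m.size_code = c.Size;
```

New size codes can then be added with an INSERT rather than by editing every query that contains the CASE expression.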

Select all rows containing duplicate values in one of two columns from within distinct groups of related records

I'm trying to create a MySQL query that will return all individual rows (not grouped) containing duplicate values from within a group of related records. By 'groups of related records' I mean those with the same account number (per the sample below).
Basically, within each group of related records that share the same distinct account number, select just those rows whose values for the date or amount columns are the same as another row's values within that account's group of records. Values should only be considered duplicate from within that account's group. The sample table and ideal output details below should clear things up.
Also, I'm not concerned with any records with a status of X being returned, even if they have duplicate values.
Small sample table with relevant data:
id account invoice date amount status
1 1 1 2012-04-01 0 X
2 1 2 2012-04-01 120 P
3 1 2 2012-05-01 120 U
4 1 3 2012-05-01 117 U
5 2 4 2012-04-01 82 X
6 2 4 2012-05-01 82 U
7 2 5 2012-03-01 81 P
8 2 6 2012-05-01 80 U
9 3 7 2012-03-01 80 P
10 3 8 2012-04-01 79 U
11 3 9 2012-04-01 78 U
Ideal output returned from desired SQL query:
id account invoice date amount status
2 1 2 2012-04-01 120 P
3 1 2 2012-05-01 120 U
4 1 3 2012-05-01 117 U
6 2 4 2012-05-01 82 U
8 2 6 2012-05-01 80 U
10 3 8 2012-04-01 79 U
11 3 9 2012-04-01 78 U
Thus, rows 7 and 9 (same date) and rows 8 and 9 (same amount) should not be returned on the basis of those matches, because the matching rows belong to different accounts and values only count as duplicates within one account. Row 8 should still be returned, however, because it shares a date with row 6 in the same account.
Later, I may want to further refine the selection by grabbing only duplicate rows that have matching statuses; row 2 would then be excluded because its status doesn't match the other two found within that account's group of records. How much more difficult would that make the query? Would it just be a matter of adding a WHERE or HAVING clause, or is it more complicated than that?
I hope my explanation of what I'm trying to accomplish makes sense. I've tried using INNER JOIN but that returns each desired row more than once. I don't want duplicates of duplicates.
Table Structure and Sample Values:
CREATE TABLE payment (
id int(11) NOT NULL auto_increment,
account int(10) NOT NULL default '0',
invoice int(10) NOT NULL default '0',
date date NOT NULL default '0000-00-00',
amount int(10) NOT NULL default '0',
status char(1) NOT NULL default '',
PRIMARY KEY (id)
);
INSERT INTO payment VALUES (1, 1, 1, '2012-04-01', 0, 'X');
INSERT INTO payment VALUES (2, 1, 2, '2012-04-01', 120, 'P');
INSERT INTO payment VALUES (3, 1, 2, '2012-05-01', 120, 'U');
INSERT INTO payment VALUES (4, 1, 3, '2012-05-01', 117, 'U');
INSERT INTO payment VALUES (5, 2, 4, '2012-04-01', 82, 'X');
INSERT INTO payment VALUES (6, 2, 4, '2012-05-01', 82, 'U');
INSERT INTO payment VALUES (7, 2, 5, '2012-03-01', 81, 'P');
INSERT INTO payment VALUES (8, 2, 6, '2012-05-01', 80, 'U');
INSERT INTO payment VALUES (9, 3, 7, '2012-03-01', 80, 'U');
INSERT INTO payment VALUES (10, 3, 8, '2012-04-01', 79, 'U');
INSERT INTO payment VALUES (11, 3, 9, '2012-04-01', 78, 'U');
This type of query can be implemented as a semi join.
Semijoins are used to select rows from one of the tables in the join.
For example:
select distinct l.*
from payment l
inner join payment r
on
l.id != r.id and l.account = r.account and
(l.date = r.date or l.amount = r.amount)
where l.status != 'X' and r.status != 'X'
order by l.id asc;
Note the use of distinct, and that I'm only selecting columns from the left table. This ensures that there are no duplicates.
The join condition checks that:
it's not joining a row to itself (l.id != r.id)
rows are in the same account (l.account = r.account)
and either the date or the amount is the same (l.date = r.date or l.amount = r.amount)
For the second part of your question, you would need to extend the ON clause in the query so that it also requires matching statuses (l.status = r.status).
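A sketch of that status-matching variant, under the assumption that "matching statuses" means the two rows being compared must carry the same status:

```sql
-- Sketch: same semi join, with the extra requirement that both rows share a status.
select distinct l.*
from payment l
inner join payment r
on
l.id != r.id and l.account = r.account and
l.status = r.status and
(l.date = r.date or l.amount = r.amount)
where l.status != 'X' and r.status != 'X'
order by l.id asc;
```

On the sample data this should drop row 2, whose status P has no counterpart within account 1, and keep rows 3, 4, 6, 8, 10 and 11.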
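This seems to work (selecting only p1.* so that the grouped result contains just the columns of the row being reported):
select p1.* from payment p1
join payment p2 on
(p1.id != p2.id
and p1.status != 'X'
and p1.account = p2.account
and (p1.amount = p2.amount or p1.date = p2.date))
group by p1.id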
http://sqlfiddle.com/#!2/a50e9/3