I am playing around with SQL a little just so I am not completely ignorant about it if I am ever asked in a job interview. My friend was recently asked the following question at an interview and he couldn't get it and I asked somebody at work who knows SQL decently and he didn't know. Can you guys answer this problem for me and then explain how it works? Please?
*The problem*
Database normalization (or lack of normalization) often presents a challenge for developers.
Consider a database table of employees that contains three fields:
EmployeeID
EmployeeName
EmailAddresses
Every employee, identified by a unique EmployeeID, may have one or more comma-separated @rockauto.com email address(es) in the EmailAddresses field.
The database table is defined below:
CREATE TABLE Employees
(
EmployeeID int UNSIGNED NOT NULL,
EmployeeName varchar(50) NOT NULL,
EmailAddresses varchar(40) NOT NULL ,
PRIMARY KEY(EmployeeID)
);
For testing purposes, here is some sample data:
INSERT INTO Employees (EmployeeID, EmployeeName, EmailAddresses) VALUES
('1', 'Bill', 'bill@companyx.com'),
('2', 'Fred', 'fred@companyx.com,freddie@companyx.com'),
('3', 'Fred', 'fredsmith@companyx.com'),
('4', 'Joe', 'joe@companyx.com,joe_smith@companyx.com');
Your task is to write a single MySQL SELECT query that will show the following output for the sample data above:
Employee  EmailAddress
Bill      bill@companyx.com
Fred (2)  fred@companyx.com
Fred (2)  freddie@companyx.com
Fred (3)  fredsmith@companyx.com
Joe       joe@companyx.com
Joe       joe_smith@companyx.com
Please take note that because there is more than one person with the same name (in this case, "Fred"), the EmployeeID is included in parentheses.
Your query is required to be written in MySQL version 5.1.41 compatible syntax. You should assume that the ordering is accomplished using standard database ascending ordering: "ORDER BY EmployeeID ASC"
For this problem, you need to submit a single SQL SELECT query. Your query should be able to process a table of 1000 records in a reasonable amount of time.
This works only if no employee has more than 10,000 email addresses. Is that acceptable?
select
if(t1.c > 1, concat(e.employeename, ' (', e.employeeid, ')'), e.employeename) as Employee,
replace(substring(substring_index(e.EmailAddresses, ',', n.row), length(substring_index(e.EmailAddresses, ',', n.row - 1)) + 1), ',', '') EmailAddress
from
(select employeename, count(*) as c from Employees group by employeename) as t1,
(select EmployeeID, length(EmailAddresses) - length(replace(EmailAddresses,',','')) + 1 as emails from Employees) as t2,
(SELECT @row := @row + 1 as row FROM
(select 0 union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) x,
(select 0 union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) x2,
(select 0 union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) x3,
(select 0 union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) x4,
(SELECT @row:=0) as ff) as n,
Employees e
where
e.employeename = t1.employeename and
e.employeeid = t2.employeeid and
n.row <= t2.emails
order by e.employeeid;
EDIT:
With fewer useless numbers generated (this version handles at most four addresses per employee):
select
if(t1.c > 1, concat(e.EmployeeName, ' (', e.EmployeeID, ')'), e.EmployeeName) as Employee,
replace(substring(substring_index(e.EmailAddresses, ',', n.row), length(substring_index(e.EmailAddresses, ',', n.row - 1)) + 1), ',', '') as EmailAddress
from
(select EmployeeName, count(*) as c from Employees group by EmployeeName) as t1,
(select EmployeeID, length(EmailAddresses) - length(replace(EmailAddresses,',','')) + 1 as emails from Employees) as t2,
(select `1` as row from (select 1 union all select 2 union all select 3 union all select 4) x) as n,
Employees e
where
e.EmployeeName = t1.EmployeeName and
e.EmployeeID = t2.EmployeeID and
n.row <= t2.emails
order by e.EmployeeID;
And what did we learn? Poor database design results in awful queries. And you can do things with SQL that are probably supported only because people design databases poorly... :)
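Since the question also asked how this works, here is a rough model of the queries' logic outside SQL. It is only an illustrative Python sketch, not MySQL: the function name `expand` and the `max_n` parameter are made up for this example, and Python's `str.split` stands in for the `substring_index` arithmetic. The comments name the derived table (t1, t2, n) that each step corresponds to.

```python
from collections import Counter

# Sample rows from the question: (EmployeeID, EmployeeName, EmailAddresses).
employees = [
    (1, "Bill", "bill@companyx.com"),
    (2, "Fred", "fred@companyx.com,freddie@companyx.com"),
    (3, "Fred", "fredsmith@companyx.com"),
    (4, "Joe", "joe@companyx.com,joe_smith@companyx.com"),
]


def expand(rows, max_n=4):
    """Return one (Employee, EmailAddress) pair per address (hypothetical helper)."""
    # t1: how many employees share each name.
    name_counts = Counter(name for _, name, _ in rows)
    # n: the numbers table 1..max_n that each employee row is joined against.
    numbers = range(1, max_n + 1)
    out = []
    for emp_id, name, emails in sorted(rows):  # ORDER BY EmployeeID
        # t2: number of addresses = number of commas + 1.
        email_count = emails.count(",") + 1
        # IF(t1.c > 1, CONCAT(name, ' (', id, ')'), name)
        label = f"{name} ({emp_id})" if name_counts[name] > 1 else name
        for n in numbers:
            if n <= email_count:  # WHERE n.row <= t2.emails
                out.append((label, emails.split(",")[n - 1]))
    return out


for employee, email in expand(employees):
    print(employee, email)
```

The key idea is the join against the numbers table: it repeats each employee row once per address, and the row number n then selects which address to extract.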
Related
I have been working on something that is easy in Excel but that I cannot do in MySQL. This is a follow-up, with new values and new requirements, to this question:
MySQL 5.5 - count open items per day
So, again I have the same table in Excel and I want to achieve Count_open in MySQL.
Excel's formula is =COUNTIFS($A$2:$A$30000,"<="&E2,$B$2:$B$30000,">="&E2)
So, in my T1 table I have two dates, open and close, and I want to calculate how many items were open on each date.
Previously I used a temp table for the last 7 days, but this time I need to stick to the T1 table alone.
To get T1 table, I use the following code:
CREATE TABLE T1
(
ID int (10),
Open_Date date,
Close_Date date);
insert into T1 values (1, '2018-12-17', '2018-12-18');
insert into T1 values (2, '2018-12-18', '2018-12-18');
insert into T1 values (3, '2018-12-18', '2018-12-18');
insert into T1 values (4, '2018-12-19', '2018-12-20');
insert into T1 values (5, '2018-12-19', '2018-12-21');
insert into T1 values (6, '2018-12-20', '2018-12-22');
insert into T1 values (7, '2018-12-20', '2018-12-22');
insert into T1 values (8, '2018-12-21', '2018-12-25');
insert into T1 values (9, '2018-12-22', '2018-12-26');
insert into T1 values (10, '2018-12-23', '2018-12-27');
So far I have tried the code below, but it does not yield the correct results.
SELECT T1.Open_Date, count(*) FROM T1
WHERE
T1.Open_Date>='2018-12-01' and T1.Close_Date <='2019-03-17'
GROUP BY T1.Open_Date;
I am lost at the moment and your help is much needed!
The difference between Excel and a database is that in Excel you generated the list of dates by hand first. You could do the same in MySQL and write one query per date, which is basically what your Excel sheet does.
But luckily MySQL isn't Excel, so we can automate that. First we must generate an interval of dates. There is a big thread about that here: generate days from date range.
Then we just have to join each generated date to the rows that were open on it, group by date, and voilà:
Select a.Date, Count(t.ID)
from (
select curdate() - INTERVAL (a.a + (10 * b.a) + (100 * c.a) + (1000 * d.a) ) DAY as Date
from (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as a
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as b
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as c
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as d
) a, T1 t
where a.Date between '2018-12-01' and '2019-03-17'
and a.Date between t.Open_Date and t.Close_Date
group by a.Date
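To check the idea against the sample data without a MySQL server, here is a small sketch using Python's built-in sqlite3 module. It is an assumption-laden stand-in, not the query above: SQLite's `date()` function and a `WITH ... VALUES` digits table replace `curdate() - INTERVAL ...` and the UNION ALL subqueries (MySQL 5.5 has no CTEs), and the calendar is anchored at a fixed start date. The names `cal`, `digits` and `count_open` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T1 (ID INTEGER, Open_Date TEXT, Close_Date TEXT);
INSERT INTO T1 VALUES
  (1, '2018-12-17', '2018-12-18'),
  (2, '2018-12-18', '2018-12-18'),
  (3, '2018-12-18', '2018-12-18'),
  (4, '2018-12-19', '2018-12-20'),
  (5, '2018-12-19', '2018-12-21'),
  (6, '2018-12-20', '2018-12-22'),
  (7, '2018-12-20', '2018-12-22'),
  (8, '2018-12-21', '2018-12-25'),
  (9, '2018-12-22', '2018-12-26'),
  (10, '2018-12-23', '2018-12-27');
""")

# Two digit tables cross-joined give offsets 0..99, i.e. 100 consecutive
# days starting at 2018-12-01. Each day is joined to the rows open on it.
QUERY = """
WITH digits(d) AS (
    VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)
)
SELECT cal.day, COUNT(t.ID) AS count_open
FROM (
    SELECT date('2018-12-01', '+' || (a.d + 10 * b.d) || ' days') AS day
    FROM digits AS a CROSS JOIN digits AS b
) AS cal
JOIN T1 AS t ON cal.day BETWEEN t.Open_Date AND t.Close_Date
GROUP BY cal.day
ORDER BY cal.day
"""

open_per_day = conn.execute(QUERY).fetchall()
for day, count_open in open_per_day:
    print(day, count_open)
```

Because this is an inner join, days on which nothing was open do not appear at all; a LEFT JOIN from the calendar side would keep them with a count of 0.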
Database Table
ID  Post         Tags
1   Range rover  range-rover,cars
2   Lamborghini  lamborghini,cars
3   Kawasaki     kawasaki,bikes
4   Yamaha R1    yamaha,r1,bikes
I want to remove duplicate values from the result.
This is what I get when I fetch the tags (stored comma-separated) from the database:
SELECT Tags from posts;
Result:
range-rover,cars lamborghini,cars kawasaki,bikes yamaha,r1,bikes
What I need is for values that have already appeared not to be shown again:
range-rover,cars lamborghini kawasaki,bikes yamaha,r1
You can split your text using a tally table and SUBSTRING_INDEX:
SELECT DISTINCT SUBSTRING_INDEX(SUBSTRING_INDEX(t.tags, ',', n.n), ',', -1) AS val
FROM posts t
CROSS JOIN
(
SELECT a.N + b.N * 10 + 1 n
FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
) n
WHERE n.n <= 1 + (LENGTH(t.tags) - LENGTH(REPLACE(t.tags, ',', '')))
SqlFiddleDemo
If you need a single row, add GROUP_CONCAT:
SELECT GROUP_CONCAT(DISTINCT SUBSTRING_INDEX(SUBSTRING_INDEX(t.tags, ',', n.n), ',', -1)) AS val
...
SqlFiddleDemo2
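To see why the nested SUBSTRING_INDEX call picks out the n-th tag, here is a small Python model of it. This is only a sketch: `substring_index` below imitates the MySQL function for a non-empty delimiter, and the helper names `nth_item` and `distinct_tags` are invented for this illustration.

```python
def substring_index(text, delim, count):
    """Rough model of MySQL's SUBSTRING_INDEX(text, delim, count)."""
    parts = text.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter.
        return delim.join(parts[:count])
    if count < 0:
        # Everything to the right of the count-th delimiter from the end.
        return delim.join(parts[count:])
    return ""


def nth_item(text, n):
    """The inner call keeps the first n items; the outer call keeps the last of those."""
    return substring_index(substring_index(text, ",", n), ",", -1)


def distinct_tags(tag_rows):
    """Model of the tally-table split followed by DISTINCT."""
    seen = []
    for tags in tag_rows:
        # Same formula as the WHERE clause: commas + 1 items per row.
        item_count = 1 + len(tags) - len(tags.replace(",", ""))
        for n in range(1, item_count + 1):
            tag = nth_item(tags, n)
            if tag not in seen:
                seen.append(tag)
    return seen


rows = ["range-rover,cars", "lamborghini,cars", "kawasaki,bikes", "yamaha,r1,bikes"]
print(distinct_tags(rows))
```

Note that this model keeps tags in first-seen order, whereas SELECT DISTINCT without an ORDER BY guarantees no particular order.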
I want to save the hassle of doing many queries for the following:
I have a table like this:
name, age
{
Mike, 7
Peter, 2
Mario, 1
Tony, 4
Mary, 2
Tom, 7
Jerry, 3
Nick, 2
Albert, 22
Steven, 7
}
And I want the following result:
Results(custom_text, num)
{
1 Year, 1
2 Year, 3
3 Year, 1
4 Year, 1
5 Year, 0
6 Year, 0
7 Year, 3
8 Year, 0
9 Year, 0
10 Year, 0
More than 10 Year, 1
}
I know how to do this but in 11 queries :( But how to simplify it?
EDIT:
Doing the following, I can obtain the non zero values, but I need the zeroes in the right places.
SELECT COUNT(*) AS AgeCount
FROM mytable
GROUP BY Age
How can I achieve this?
Thanks for reading.
You can use the query below, but it will not show the gaps (ages with a count of zero). If you need the gaps, use Linoff's answer:
select t.txt, count(t.age) from
(select
case
when age<11 then concat(age ,' year')
else 'more than 10'
end txt, age
from your_table)t
group by t.txt
order by 1
SQL FIDDLE DEMO
You can use left join and a subquery to get what you want:
select coalesce(concat(ages.n, ' year'), 'More than 10 year') as custom_text,
count(t.age)
from (select 1 as n union all select 2 union all select 3 union all select 4 union all
select 5 union all select 6 union all select 7 union all select 8 union all
select 9 union all select 10 union all select null
) ages left join
tabla t
on (t.age = ages.n or ages.n is null and t.age > 10)
group by ages.n;
EDIT:
I think the following is a better way to do this query:
select (case when least(age, 11) = 11 then 'More than 10 year'
else concat(age, ' year')
end) as agegroup, count(name)
from (select 1 as age, NULL as name union all
select 2, NULL union all
select 3, NULL union all
select 4, NULL union all
select 5, NULL union all
select 6, NULL union all
select 7, NULL union all
select 8, NULL union all
select 9, NULL union all
select 10, NULL union all
select 11, NULL
union all
select age, name
from tabla t
) t
group by least(age, 11);
Basically, the query needs a full outer join, and MySQL does not provide one. However, we can get the same result by adding an extra row for each age group, so that every group is present. Because name is NULL in those rows, count(name) returns 0 for groups that have no real rows.
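The seeded-rows trick can be tried out with Python's built-in sqlite3 module. Treat this as a sketch under assumptions rather than the MySQL query itself: the table name `people`, the CTE names `seed` and `combined`, and the `bucket` column are made up, SQLite's two-argument `MIN()` plays the role of `LEAST()`, and the `WITH` clause is a SQLite convenience that older MySQL versions lack.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Mike", 7), ("Peter", 2), ("Mario", 1), ("Tony", 4), ("Mary", 2),
     ("Tom", 7), ("Jerry", 3), ("Nick", 2), ("Albert", 22), ("Steven", 7)],
)

# seed: one placeholder row per age group, with a NULL name so that
# COUNT(name) ignores it. bucket 11 collects every age above 10.
QUERY = """
WITH seed(age, name) AS (
    VALUES (1, NULL), (2, NULL), (3, NULL), (4, NULL), (5, NULL), (6, NULL),
           (7, NULL), (8, NULL), (9, NULL), (10, NULL), (11, NULL)
),
combined AS (
    SELECT MIN(age, 11) AS bucket, name
    FROM (SELECT age, name FROM seed
          UNION ALL
          SELECT age, name FROM people)
)
SELECT CASE WHEN bucket = 11 THEN 'More than 10 Year'
            ELSE bucket || ' Year' END AS custom_text,
       COUNT(name) AS num
FROM combined
GROUP BY bucket
ORDER BY bucket
"""

results = conn.execute(QUERY).fetchall()
for custom_text, num in results:
    print(custom_text, num)
```

Every group exists because of its seed row, and the zeroes come from COUNT(name) skipping the NULL names.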
Please try this query for the required output.
SQL FIDDLE link http://www.sqlfiddle.com/#!9/4e52a/6
select coalesce(concat(ages.n, ' year'), 'More than 10 year') as custom_text,
count(t.age) from (select 1 as n union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9 union all select 10 union all select null ) ages left join tabla t
on (case when ages.n<11 then t.age = ages.n else t.age > 10 end)
group by ages.n;
I do an update in a table row by row:
UPDATE table
SET col = $value
WHERE id = $id
Now if I update e.g. 10000 records each record gets the $value but it does not really matter which $id gets which $value. The only requirement I have is that all the records I am updating end up with a $value.
So how could I convert this update to something like
UPDATE table
SET col ?????? what here from a $value_list???
WHERE id IN ($id_list)
I.e., pass the list of ids and, somehow, the list of values, so that each id in that range gets a value.
Let's assume you've got two comma-separated lists, one of ids and one of values, with the same number of items. Then you could do your update with statements like these:
-- the list of the ids
SET @ids = '2,4,5,6';
-- the list of the values
SET @vals = '17, 73,55, 12';
UPDATE yourtable
INNER JOIN (
SELECT
SUBSTRING_INDEX(SUBSTRING_INDEX(t.ids, ',', n.n), ',', -1) id,
SUBSTRING_INDEX(SUBSTRING_INDEX(t.vals, ',', n.n), ',', -1) val
FROM (SELECT @ids as ids, @vals as vals) t
CROSS JOIN (
-- build for up to 1000 separated values
SELECT
1 + a.N + b.N * 10 + c.N * 100 AS n
FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
ORDER BY n
) n
WHERE n <= (1 + LENGTH(t.ids) - LENGTH(REPLACE(t.ids, ',', '')))
) t1
ON
yourtable.id = t1.id
SET
yourtable.val = t1.val;
Explanation
The inner series of UNIONs builds a table with the numbers from 1 to 1000. You should be able to expand this mechanism to your needs:
-- build for up to 1000 separated values
SELECT
a.N + b.N * 10 + c.N * 100 + 1 AS n
FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
ORDER BY n
We use these numbers to get the items out of our lists with the nested SUBSTRING_INDEX calls:
SUBSTRING_INDEX(SUBSTRING_INDEX(t.ids, ',', n.n), ',', -1) id,
SUBSTRING_INDEX(SUBSTRING_INDEX(t.vals, ',', n.n), ',', -1) val
The WHERE clause limits n to the number of items in the lists (it checks only one of the two, since both have the same length):
WHERE n <= (1 + LENGTH(t.ids) - LENGTH(REPLACE(t.ids, ',', '')))
Because a list has one separator fewer than it has items, we add 1 to the difference between the length of the list and its length with all separators removed.
Then we do the UPDATE with a JOIN operation on the id values in the outer UPDATE statement.
See it working in this fiddle.
Believe me: This is much faster than agonizing row-by-row update.
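The steps of the explanation can be traced in a few lines of Python. This is just a model of the logic, not the SQL: the helper names (`numbers_table`, `item_count`, `pair_lists`, `apply_update`) and the sample table are invented for the illustration, and a dict stands in for `yourtable`.

```python
def numbers_table():
    """The three cross-joined digit tables: every a + 10*b + 100*c + 1."""
    digits = range(10)
    return sorted(a + b * 10 + c * 100 + 1
                  for a in digits for b in digits for c in digits)


def item_count(csv_list):
    """Separators + 1, computed the same way as in the WHERE clause."""
    return 1 + len(csv_list) - len(csv_list.replace(",", ""))


def pair_lists(ids, vals):
    """Pair the n-th id with the n-th value for every n up to the item count."""
    id_items = ids.split(",")
    val_items = vals.split(",")
    return [(int(id_items[n - 1]), int(val_items[n - 1]))
            for n in numbers_table() if n <= item_count(ids)]


def apply_update(table, ids, vals):
    """Model of UPDATE ... INNER JOIN: only rows whose id is in the list change."""
    pairs = dict(pair_lists(ids, vals))
    return {row_id: pairs.get(row_id, old) for row_id, old in table.items()}


table = {1: 0, 2: 0, 4: 0, 5: 0, 6: 0, 7: 0}  # hypothetical id -> val rows
updated = apply_update(table, "2,4,5,6", "17, 73,55, 12")
print(updated)
```

The stray spaces in the value list are harmless here because `int()` ignores surrounding whitespace; in MySQL the result depends on how the target column converts such strings.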
Good Evening,
I have a query like this:
SELECT @c:=@c+1 as Count, CurrentISP
FROM (
SELECT 'yahoo.com' as currentISP
UNION ALL SELECT 'yahoo.com'
UNION ALL SELECT 'gmail.com'
UNION ALL SELECT 'gmail.com'
UNION ALL SELECT 'hotmail.com'
UNION ALL SELECT 'hotmail.com'
) t
INNER JOIN ( SELECT @c:=0 ) c
Which produces a result set as follows:
Count CurrentISP
--
1 yahoo.com
2 yahoo.com
3 gmail.com
4 gmail.com
5 hotmail.com
6 hotmail.com
What I want to do is give this set an ordering as follows:
1) yahoo.com
2) gmail.com
3) hotmail.com
4) maybe some other email provider
5) when all providers have been used, start over again
6) yahoo.com etc etc
The reason I would like to do this is to avoid spamming a single provider with emails all at the same time, so that I can bring my sender score reputation back up to 100%.
This is a little hacky, but you can order by a running count of how many times each name has appeared. It will cycle through the providers for as many records as you put in.
SELECT @D:=@D+1 AS Count, CurrentISP
FROM
( SELECT @C:=0, @A:=0, @B:=0, @D:=0 ) temp,
(
SELECT *
FROM (
SELECT 'yahoo.com' AS currentISP
UNION ALL SELECT 'yahoo.com'
UNION ALL SELECT 'gmail.com'
UNION ALL SELECT 'gmail.com'
UNION ALL SELECT 'hotmail.com'
UNION ALL SELECT 'hotmail.com'
) t
ORDER BY
CASE
WHEN currentISP='yahoo.com' THEN @A := @A + 1
WHEN currentISP='gmail.com' THEN @B := @B + 1
WHEN currentISP='hotmail.com' THEN @C := @C + 1
END DESC
) AS t
If you want to order a list of duplicate values, maybe something like this would be better
select currentIsp
from
(
select 1 as n
union select 2
-- Add as many unique numbers you need
) as a,
(
select 'yahoo.com' as currentISP
union select 'gmail.com'
union select 'hotmail.com'
-- More UNION queries with unique ISP names
) as t
order by a.n, t.currentIsp
(I don't think you need a Count column, but you can add it if you want)
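The ordering the question asks for (one address per provider, then start over) can be described as "sort by how many times this provider has already appeared". Here is a minimal Python sketch of that idea; the function name `round_robin` is made up, and it is one possible way to model the intended result rather than a translation of either query.

```python
from collections import defaultdict


def round_robin(isps):
    """Reorder so providers alternate: sort by per-provider occurrence number."""
    occurrences = defaultdict(int)  # how many times each provider was seen so far
    first_seen = {}                 # tie-breaker: order in which providers first appear
    keyed = []
    for isp in isps:
        occurrences[isp] += 1
        first_seen.setdefault(isp, len(first_seen))
        keyed.append((occurrences[isp], first_seen[isp], isp))
    # All first occurrences come first, then all second occurrences, and so on.
    return [isp for _, _, isp in sorted(keyed)]


rows = ["yahoo.com", "yahoo.com", "gmail.com", "gmail.com",
        "hotmail.com", "hotmail.com"]
for count, isp in enumerate(round_robin(rows), start=1):
    print(count, isp)
```

When one provider has more rows than the others, its surplus rows end up together at the end, since there is nothing left to interleave them with.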