Is there a faster way to execute the following SQL request? - mysql

I have a table that contains the following columns:
id, name, domain, added, is_verified
1, "First Google", "google.com", DATE(), 1
2, "Second Google", "google.com", DATE(), 1
3, "Third Google", "google.com", DATE(), 1
4, "First disney", "disney.com", DATE(), 1
5, "Second disney", "disney.com", DATE(), 1
6, "Third disney", "disney.com", DATE(), 0
7, "First example", "example.com", DATE(), 0
8, "Second example", "example.com", DATE(), 0
And the following request:
SELECT domain FROM mytable WHERE domain NOT IN
(SELECT domain FROM mytable WHERE is_verified = 1 GROUP BY domain)
GROUP BY domain ORDER BY added DESC;
The main idea behind this request is to get every domain that has no row with is_verified set to true.
In the example above, this would return "example.com" only once.
The request works well, but it takes time to execute (I have thousands of entries). Is there another way to write this request that would be faster and more efficient?

You can use a LEFT JOIN with a NULL check:
SELECT T1.Domain
FROM mytable T1
LEFT JOIN mytable T2 ON T2.domain = T1.domain AND T2.is_verified = 1
WHERE T2.ID IS NULL
Sample execution with the given data:
DECLARE @TESTDOMAIN TABLE (id int, name varchar(100), domain varchar(100), added datetime, is_verified bit)
insert into @TESTDOMAIN (id, name, domain, added, is_verified)
SELECT 1, 'First Google', 'google.com', GETDATE(), 1 UNION
SELECT 2, 'Second Google', 'google.com', GETDATE(), 1 UNION
SELECT 3, 'Third Google', 'google.com', GETDATE(), 1 UNION
SELECT 4, 'First disney', 'disney.com', GETDATE(), 1 UNION
SELECT 5, 'Second disney', 'disney.com', GETDATE(), 1 UNION
SELECT 6, 'Third disney', 'disney.com', GETDATE(), 0 UNION
SELECT 7, 'First example', 'example.com', GETDATE(), 0 UNION
SELECT 8, 'Second example', 'example.com', GETDATE(), 0
SELECT T1.Domain
FROM @TESTDOMAIN T1
LEFT JOIN @TESTDOMAIN T2 ON T2.domain = T1.domain AND T2.is_verified = 1
WHERE T2.ID IS NULL

SELECT domain
FROM mytable
group by domain
having max(is_verified) = 0
ORDER BY max(added) DESC
I added the order by clause. You have to decide which added record you want to take for each domain. I chose the max added value of a domain.

Why do you have to use a subselect? Wouldn't this deliver the same result?
SELECT domain
FROM mytable
GROUP BY domain
HAVING sum(is_verified)<1;
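To compare the suggestions above side by side, here is a minimal sketch that loads the question's sample rows and runs three of the variants. It makes some assumptions: an in-memory SQLite database (via Python's sqlite3) stands in for MySQL, the `added` dates are invented because the question only shows DATE(), and DISTINCT is added to the LEFT JOIN variant because the plain join returns one row per unverified record. It only checks that the results agree, it says nothing about speed on a real table.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; `added` dates are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable "
    "(id INTEGER, name TEXT, domain TEXT, added TEXT, is_verified INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        (1, "First Google", "google.com", "2014-01-01", 1),
        (2, "Second Google", "google.com", "2014-01-02", 1),
        (3, "Third Google", "google.com", "2014-01-03", 1),
        (4, "First disney", "disney.com", "2014-01-04", 1),
        (5, "Second disney", "disney.com", "2014-01-05", 1),
        (6, "Third disney", "disney.com", "2014-01-06", 0),
        (7, "First example", "example.com", "2014-01-07", 0),
        (8, "Second example", "example.com", "2014-01-08", 0),
    ],
)

QUERIES = {
    # Original approach from the question.
    "not_in": """
        SELECT domain FROM mytable
        WHERE domain NOT IN (SELECT domain FROM mytable WHERE is_verified = 1)
        GROUP BY domain
    """,
    # Anti-join; DISTINCT collapses the duplicate rows per domain.
    "left_join": """
        SELECT DISTINCT t1.domain
        FROM mytable t1
        LEFT JOIN mytable t2 ON t2.domain = t1.domain AND t2.is_verified = 1
        WHERE t2.id IS NULL
    """,
    # Single pass with an aggregate filter.
    "having_max": """
        SELECT domain FROM mytable
        GROUP BY domain
        HAVING MAX(is_verified) = 0
    """,
}

results = {name: [row[0] for row in conn.execute(sql)] for name, sql in QUERIES.items()}
for name, domains in results.items():
    print(name, domains)
```

On the real table, running EXPLAIN on each variant would be the way to see which one the optimizer handles best.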

Related

How to count the number of results in multiple group by

I have an SQL statement
SELECT
ID
, PERSON
, STATE
, VDATE
, count(PERSON)
, count(VDATE)
from myTable
group by
PERSON
, STATE
, VDATE;
I am interested in the VDATE. There could be records that have a blank VDATE and possibly more than one VDATE.
My ideal result is a list where there is only one result from the previous select AND VDATE is null.
So for the following dataset
ID, PERSON, STATE, VDATE, count(PERSON), count(VDATE)
1234, 9000, ND, 2014-04-24, 1, 1
1235, 9000, ND, , 2, 2
1236, 9001, CA, , 2, 2
1237, 9002, CA, , 2, 2
1238, 9002, NV, , 2, 2
1239, 9003, MD, 2014-04-24, 2, 2
I would want 1236, 1237 and 1238 returned
Hmmm, this might be what you are describing:
select ID, PERSON, STATE, VDATE, count(PERSON), count(VDATE)
from myTable
where VDATE IS NOT NULL
group by PERSON, STATE, VDATE
UNION ALL
select NULL, NULL, NULL, NULL, count(PERSON), 0
from myTable
where VDATE IS NULL;

SQL query - credit , debit , balance

DISCLAIMER: I know this has been asked numerous times, but all I want is an alternative.
The table is as below:
create table
Account
(Name varchar(20),
TType varchar(5),
Amount int);
insert into Account Values
('abc' ,'c', 500),
('abc', 'c', 700),
('abc', 'd', 100),
('abc', 'd', 200),
('ab' ,'c', 300),
('ab', 'c', 700),
('ab', 'd', 200),
('ab', 'd', 200);
Expected result is simple:
Name Balance
------ -----------
ab 600
abc 900
The query that worked is:
select Name, sum(case TType when 'c' then Amount
when 'd' then Amount * -1 end) as balance
from Account a1
group by Name;
All I want to know is: is there any query without the 'case' statement (like a subquery or self join) that gives the same result?
Sure. You can use a second query with a where clause and a union all, then sum the two partial results per name:
select Name
, sum(balance) balance
from ( select Name
, sum(Amount) balance
from Account a1
where TType = 'c'
group
by Name
union
all
select Name
, sum(Amount * -1) balance
from Account a1
where TType = 'd'
group
by Name
) t
group
by Name
Or this, using a join with an inline view:
select name
, sum(Amount * o.mult) balance
from Account a1
join ( select 'c' cd
, 1 mult
from dual
union all
select 'd'
, -1
from dual
) o
on o.cd = a1.TType
group
by Name
To be honest, I would suggest using case...
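As a rough check of the inline-view idea above, the sketch below runs it against the question's Account rows next to the original case query. It assumes an in-memory SQLite database through Python's sqlite3, so the `from dual` part is dropped (SQLite allows a SELECT without FROM); the aliases `cd` and `mult` follow the answer.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the asker's database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account (Name TEXT, TType TEXT, Amount INTEGER)")
conn.executemany(
    "INSERT INTO Account VALUES (?, ?, ?)",
    [
        ("abc", "c", 500), ("abc", "c", 700), ("abc", "d", 100), ("abc", "d", 200),
        ("ab", "c", 300), ("ab", "c", 700), ("ab", "d", 200), ("ab", "d", 200),
    ],
)

# Inline view maps each transaction type to a sign: credit +1, debit -1.
SIGN_JOIN = """
    SELECT a.Name, SUM(a.Amount * o.mult) AS balance
    FROM Account a
    JOIN (SELECT 'c' AS cd, 1 AS mult
          UNION ALL
          SELECT 'd', -1) o ON o.cd = a.TType
    GROUP BY a.Name
    ORDER BY a.Name
"""

# The asker's working query, kept as the reference result.
CASE_SUM = """
    SELECT Name, SUM(CASE TType WHEN 'c' THEN Amount
                                WHEN 'd' THEN Amount * -1 END) AS balance
    FROM Account
    GROUP BY Name
    ORDER BY Name
"""

join_balances = conn.execute(SIGN_JOIN).fetchall()
case_balances = conn.execute(CASE_SUM).fetchall()
print(join_balances)
print(case_balances)
```

Both queries give ab 600 and abc 900 on this data, matching the expected result in the question.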
Use the ASCII code of the char and try to go from there. It is 100 for 'd' and 99 for 'c'. Untested example:
select Name, sum((ASCII(TType) - 100) * Amount * (-1)) + sum((ASCII(TType) - 99) * Amount * (-1)) as balance from Account a1 group by Name;
I would not recommend using this method but it is a way of achieving what you want.
select t.Name, sum(t.cr) - sum(t.dr) as balance from (select Name, case TType when 'c' then sum(Amount) else 0 end as cr, case TType when 'd' then sum(Amount) else 0 end as dr from Account group by Name, TType) t group by t.Name;
This will surely help you!!
The following worked for me on Microsoft SQL Server. It includes the brought-forward balance as well:
WITH tempDebitCredit AS (
Select 0 As Details_ID, null As Creation_Date, null As Reference_ID,
'Brought Forward' As Transaction_Kind, null As Amount_Debit, null As Amount_Credit,
isNull(Sum(Amount_Debit - Amount_Credit), 0) 'diff'
From _YourTable_Name
where Account_ID = @Account_ID
And Creation_Date < @Query_Start_Date
Union All
SELECT a.Details_ID, a.Creation_Date, a.Reference_ID, a.Transaction_Kind,
a.Amount_Debit, a.Amount_Credit, a.Amount_Debit - a.Amount_Credit 'diff'
FROM _YourTable_Name a
where Account_ID = @Account_ID
And Creation_Date >= @Query_Start_Date And Creation_Date <= @Query_End_Date
)
SELECT a.Details_ID, a.Creation_Date, a.Reference_ID, a.Transaction_Kind,
a.Amount_Debit, a.Amount_Credit, SUM(b.diff) 'Balance'
FROM tempDebitCredit a, tempDebitCredit b
WHERE b.Details_ID <= a.Details_ID
GROUP BY a.Details_ID, a.Creation_Date, a.Reference_ID, a.Transaction_Kind,
a.Amount_Debit, a.Amount_Credit
Order By a.Details_ID Desc
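The running-balance part of the query above relies on a self join where each row is paired with every row up to and including itself. Here is a minimal sketch of just that step, assuming a hypothetical `ledger` table with invented amounts, run on in-memory SQLite through Python's sqlite3; the brought-forward row and the date filters are left out.

```python
import sqlite3

# Assumption: `ledger` and its rows are invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ledger "
    "(details_id INTEGER PRIMARY KEY, amount_debit INTEGER, amount_credit INTEGER)"
)
conn.executemany(
    "INSERT INTO ledger VALUES (?, ?, ?)",
    [(1, 100, 0), (2, 0, 30), (3, 50, 0), (4, 0, 20)],
)

# Each row a is joined to all rows b with a smaller or equal id, so
# SUM(debit - credit) over b is the balance after row a.
RUNNING_BALANCE = """
    SELECT a.details_id, SUM(b.amount_debit - b.amount_credit) AS balance
    FROM ledger a
    JOIN ledger b ON b.details_id <= a.details_id
    GROUP BY a.details_id
    ORDER BY a.details_id
"""

balances = conn.execute(RUNNING_BALANCE).fetchall()
print(balances)
```

The self join compares every row with every earlier row, so the work grows quadratically with the number of rows in the date range.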

SQL - Select Boolean Results from Table

Well, I couldn't find a proper title for this question, sorry about that.
I have a table where I store the emails sent to users.
From this table I can tell whether the user has read an email or not.
Table structure:
[MAILSEND_ID] (INT),
[ID_USER] (INT),
[MAIL_ID] (INT),
[READ] (BIT)
Data:
;WITH cte AS (
SELECT * FROM (VALUES
(1, 10256, 10, 0),
(1, 10257, 10, 1),
(1, 10258, 10, 1),
(1, 10259, 10, 0),
(2, 10256, 10, 0),
(2, 10257, 10, 0),
(2, 10258, 10, 1),
(2, 10259, 10, 0),
(3, 10256, 10, 1),
(3, 10257, 10, 0),
(3, 10258, 10, 0),
(3, 10259, 10, 0)
) as t(MAILSEND_ID, ID_USER, MAIL_ID, [READ])
)
SELECT * FROM cte;
In this example, you can see I have 4 users and 3 emails sent.
User 10256
1st Email - Not read
2nd Email - Not read
3rd Email - Read
I need to write a select on this table that takes a [MAIL_ID] and a [NUMBER]; the number represents how many of the most recent e-mails in a row the user has not read.
Using the last example:
Given [NUMBER] = 3, [MAIL_ID] = 10
Return the USER_ID 10259 only.
Given [NUMBER] = 2, [MAIL_ID] = 10
Return the USER_ID 10257, 10259.
Given [NUMBER] = 1, [MAIL_ID] = 10
Return the USER_ID 10257, 10258, 10259.
In other words, a user can have an accumulated number of unread e-mails, but if the user read the last e-mail, they must not be returned by the query.
This is my query today, but it only returns the total number of unread emails:
select * from (
select
a.[USER_ID],
COUNT(a.[USER_ID]) as tt
from
emailmkt.mailing_history a
where
a.[MAIL_ID] = 58 and
a.[READ]=0
group by
[USER_ID]
) aa where tt > [NUMBER]
So the logic is not right. I want to move this logic into SQL rather than doing it in code, if possible.
Sorry for any English errors as well.
Thanks in advance.
With the following query you can get the rolling count of unread mails per user, based on the assumption that mailsend_id is time-ordered (I changed READ to IsRead because I don't have the ` character on my keyboard):
SELECT ID_USER, Mail_ID
, groupid CURRENT
, @roll := CASE WHEN coalesce(@groupid, '') = groupid
THEN @roll + 1
ELSE 1
END AS roll
, @groupid := groupid OLD
FROM (SELECT mh.ID_USER, mh.Mail_ID
, concat(mh.id_user, mh.mail_id) groupid
FROM mailing_history mh
INNER JOIN (SELECT id_user
, max(CASE isread
WHEN 1 THEN MAILSEND_ID
ELSE 0
END) lastRead
FROM mailing_history
GROUP BY id_user) lr
ON mh.id_user = lr.id_user AND mh.MAILSEND_ID > lr.lastread
ORDER BY id_user, MAILSEND_ID) a
Demo: SQLFiddle
The column roll holds the rolling count of unread mails for the user.
By adding one more level you can check the value of roll against NUMBER in a WHERE condition and group_concat the user_id.
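A variant of the same idea without user variables: find each user's last read MAILSEND_ID, count the rows sent after it, and filter with HAVING. This is only a sketch under a few assumptions: in-memory SQLite through Python's sqlite3, READ renamed to `is_read`, MAILSEND_ID values greater than 0 and increasing over time, and a helper name `users_with_unread_streak` made up for the example. It uses >= because the examples in the question include users whose streak equals NUMBER.

```python
import sqlite3

# Assumption: in-memory SQLite; rows copied from the question's sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mailing_history "
    "(mailsend_id INTEGER, id_user INTEGER, mail_id INTEGER, is_read INTEGER)"
)
conn.executemany(
    "INSERT INTO mailing_history VALUES (?, ?, ?, ?)",
    [
        (1, 10256, 10, 0), (1, 10257, 10, 1), (1, 10258, 10, 1), (1, 10259, 10, 0),
        (2, 10256, 10, 0), (2, 10257, 10, 0), (2, 10258, 10, 1), (2, 10259, 10, 0),
        (3, 10256, 10, 1), (3, 10257, 10, 0), (3, 10258, 10, 0), (3, 10259, 10, 0),
    ],
)

# last_read is 0 for users who never read anything, so all their rows count.
STREAK_SQL = """
    SELECT mh.id_user
    FROM mailing_history mh
    JOIN (SELECT id_user,
                 MAX(CASE WHEN is_read = 1 THEN mailsend_id ELSE 0 END) AS last_read
          FROM mailing_history
          WHERE mail_id = :mail_id
          GROUP BY id_user) lr
      ON lr.id_user = mh.id_user AND mh.mailsend_id > lr.last_read
    WHERE mh.mail_id = :mail_id
    GROUP BY mh.id_user
    HAVING COUNT(*) >= :number
    ORDER BY mh.id_user
"""


def users_with_unread_streak(conn, mail_id, number):
    """Return users whose most recent `number` (or more) sends are all unread."""
    params = {"mail_id": mail_id, "number": number}
    return [row[0] for row in conn.execute(STREAK_SQL, params)]


for number in (3, 2, 1):
    print(number, users_with_unread_streak(conn, 10, number))
```

User 10256 never shows up because the last send was read, which leaves no rows after `last_read` to count.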

Count occurrences that differ within a column

I want to be able to select the number of times the data in columns Somedata_A and Somedata_B has changed from the previous row within its column. I've tried using DISTINCT and it works to some degree. {1,2,3,2,1,1} will show 3 when I want it to show 4, because there are 5 different values in sequence (1, 2, 3, 2, 1), which means 4 changes.
Example:
A,B,C,D,E,F
{1,2,3,2,1,1}
Comparing A to B gives a difference, B to C gives a difference . . . E to F gives no difference. All in all, that is 4 differences within a set of 6 values.
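To make the difference between the two counts concrete, here is a small sketch outside SQL; the function name `count_changes` is made up for the example. It compares each value with the one right before it, which is the count being asked for, and also shows the distinct count that DISTINCT produces.

```python
def count_changes(values):
    """Count positions where a value differs from the one right before it."""
    return sum(1 for prev, cur in zip(values, values[1:]) if prev != cur)


sequence = [1, 2, 3, 2, 1, 1]
changes = count_changes(sequence)   # adjacent comparisons: A-B, B-C, ..., E-F
distinct = len(set(sequence))       # what COUNT(DISTINCT ...) measures
print(changes, distinct)
```

For the sequence above this gives 4 changes and 3 distinct values, so any SQL solution has to compare neighbouring rows rather than deduplicate.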
I have gotten DISTINCT to work, but it does not really do the trick for me. To add to the question, I'm not really interested in the whole range, let's say just the last 2 days/entries per Title.
Second, I'm concerned about performance. I tried the query below on a real set of data and it got interrupted, probably due to a timeout.
SQL Fiddle
MySQL 5.5.32 Schema Setup:
CREATE TABLE testdata(
Title varchar(10),
Date varchar(10),
Somedata_A int(5),
Somedata_B int(5));
INSERT INTO testdata (Title, Date, Somedata_A, Somedata_B) VALUES
("Alpha", '123', 1, 2),
("Alpha", '234', 2, 2),
("Alpha", '345', 1, 2),
("Alpha", '349', 1, 2),
("Alpha", '456', 1, 2),
("Omega", '123', 1, 1),
("Omega", '234', 2, 2),
("Omega", '345', 3, 3),
("Omega", '349', 4, 3),
("Omega", '456', 5, 4),
("Delta", '123', 1, 1),
("Delta", '234', 2, 2),
("Delta", '345', 1, 3),
("Delta", '349', 2, 3),
("Delta", '456', 1, 4);
Query 1:
SELECT t.Title, (SELECT COUNT(DISTINCT Somedata_A) FROM testdata AS tt WHERE t.Title = tt.Title) AS A,
(SELECT COUNT(DISTINCT Somedata_B) FROM testdata AS tt WHERE t.Title = tt.Title) AS B
FROM testdata AS t
GROUP BY t.Title
Results:
| TITLE | A | B |
|-------|---|---|
| Alpha | 2 | 1 |
| Delta | 2 | 4 |
| Omega | 5 | 4 |
Something like this may work: it uses a variable for row number, joins on an offset of 1 and then counts differences for A and B.
http://sqlfiddle.com/#!2/3bbc8/9/2
set @i = 0;
set @j = 0;
Select
A.Title aTitle,
sum(Case when A.SomeData_A <> B.SomeData_A then 1 else 0 end) AVar,
sum(Case when A.SomeData_B <> B.SomeData_B then 1 else 0 end) BVar
from
(SELECT Title, @i:=@i+1 as ROWID, SomeData_A, SomeData_B
FROM testdata
ORDER BY Title, date desc) as A
INNER JOIN
(SELECT Title, @j:=@j+1 as ROWID, SomeData_A, SomeData_B
FROM testdata
ORDER BY Title, date desc) as B
FROM testdata
ORDER BY Title, date desc) as B
ON A.RowID= B.RowID + 1
AND A.Title=B.Title
Group by A.Title
This works (see here) (FYI: Your results in the question do not match your data - for instance, for Alpha, ColumnA: it never changes from 1. The answer should be 0)
Hopefully you can adapt this Statement to your actual data model
SELECT t1.title, SUM(t1.Somedata_A<>t2.Somedata_a) as SomeData_A
,SUM(t1.Somedata_b<>t2.Somedata_b) as SomeData_B
FROM testdata AS t1
JOIN testdata AS t2
ON t1.title = t2.title
AND t2.date = DATE_ADD(t1.date, INTERVAL 1 DAY)
GROUP BY t1.title
ORDER BY t1.title;
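The join above assumes consecutive rows are exactly one day apart. If the Date values have gaps, as in the question's sample ('123', '234', ...), one alternative is to join each row to the row with the greatest earlier Date for the same Title. A minimal sketch of that, assuming in-memory SQLite through Python's sqlite3 and that the Date strings sort correctly as text:

```python
import sqlite3

# Assumption: in-memory SQLite; rows copied from the question's schema setup.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE testdata (Title TEXT, Date TEXT, Somedata_A INTEGER, Somedata_B INTEGER)"
)
conn.executemany(
    "INSERT INTO testdata VALUES (?, ?, ?, ?)",
    [
        ("Alpha", "123", 1, 2), ("Alpha", "234", 2, 2), ("Alpha", "345", 1, 2),
        ("Alpha", "349", 1, 2), ("Alpha", "456", 1, 2),
        ("Omega", "123", 1, 1), ("Omega", "234", 2, 2), ("Omega", "345", 3, 3),
        ("Omega", "349", 4, 3), ("Omega", "456", 5, 4),
        ("Delta", "123", 1, 1), ("Delta", "234", 2, 2), ("Delta", "345", 1, 3),
        ("Delta", "349", 2, 3), ("Delta", "456", 1, 4),
    ],
)

# prev is the row with the latest Date before cur within the same Title;
# a comparison yields 1 or 0, so SUM counts the changes.
CHANGES_SQL = """
    SELECT cur.Title,
           SUM(cur.Somedata_A <> prev.Somedata_A) AS a_changes,
           SUM(cur.Somedata_B <> prev.Somedata_B) AS b_changes
    FROM testdata cur
    JOIN testdata prev
      ON prev.Title = cur.Title
     AND prev.Date = (SELECT MAX(p.Date) FROM testdata p
                      WHERE p.Title = cur.Title AND p.Date < cur.Date)
    GROUP BY cur.Title
    ORDER BY cur.Title
"""

changes = conn.execute(CHANGES_SQL).fetchall()
print(changes)
```

The correlated subquery runs once per row, so on a large table an index on (Title, Date) would probably matter; that part is not tested here.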

Recursive Fill Calculations with CTE or anything efficient

Please help me with ideas (preferably a CTE) to solve this as efficiently as possible.
So... in the table shown, the red cells in column "Value" are the known values
and the highlighted green cells are values to be calculated with the formulas shown next to them.
I am trying to see if this is possible with CTEs at all.
Each unknown value is computed from the last known value and its interval, the next known value and its interval, and the interval the value is being calculated for; the result is then in turn used the very same way for the next unknown value.
Here is a solution.
Hope it helps. :)
;with testdata(store,shipntrvl,value)
as
(
select 'abc', 1, 0.56
union all
select 'abc', 5, null
union all
select 'abc', 10, 0.63
union all
select 'abc', 15, null
union all
select 'abc', 20, null
union all
select 'abc', 25, null
union all
select 'abc', 30, 0.96
union all
select 'xyz', 1, 0.36
union all
select 'xyz', 5, 0.38
union all
select 'xyz', 10, null
union all
select 'xyz', 15, 0.46
union all
select 'xyz', 20, null
union all
select 'xyz', 25, null
union all
select 'xyz', 30, 0.91
)
,calc
as
(
select *
,ROW_NUMBER() OVER(partition by store order by shipntrvl) as row_no
from testdata
)
,extra
as
(
select *
,(select top 1 row_no
from calc c2
where c2.row_no < c1.row_no
and c1.value is null
and c2.value is not null
and c1.store = c2.store
order by c2.row_no desc) as prev_nr
,(select top 1 row_no
from calc c2
where c2.row_no > c1.row_no
and c1.value is null
and c2.value is not null
and c1.store = c2.store
order by c2.row_no asc) as next_nr
from calc c1
)
select c.store
,c.shipntrvl
,c.value
,isnull(c.value,
(cnext.value-cprev.value)/
(cnext.shipntrvl-cprev.shipntrvl)*
(c.shipntrvl-cprev.shipntrvl)+cprev.value
) as calculated_value
from calc c
join extra
on extra.row_no = c.row_no
and extra.store = c.store
join calc cnext
on cnext.row_no = case when c.value is null
then extra.next_nr
else c.row_no
end
and c.store = cnext.store
join calc cprev
on cprev.row_no = case when c.value is null
then extra.prev_nr
else c.row_no
end
and c.store = cprev.store
Here is what I came up with (storevalue is the starting table in your example):
with knownvalues as (
select store, shipNtrvl,value
from storevalue where Value is not null
), valueranges as
(
select
k.store,
k.ShipNtrvl as lowrange,
MIN(s.ShipNtrvl) as highrange,
(select value from storevalue where store = k.store and ShipNtrvl = MIN(s.shipNtrvl))-
(select value from storevalue where store = k.store and ShipNtrvl = k.ShipNtrvl) as term1,
MIN(s.ShipNtrvl) - k.ShipNtrvl as term2,min(k.Value) as lowval
from knownvalues k
join storevalue s on s.Value is not null and s.store= k.store and s.ShipNtrvl > k.ShipNtrvl
group by k.store, k.shipntrvl
)
select s.store,s.ShipNtrvl,v.term1/v.term2*(s.ShipNtrvl-v.lowrange)+ v.lowval as value
from storevalue s join valueranges v on v.store = s.store and s.ShipNtrvl between v.lowrange and v.highrange
where s.Value is null
union
select * from storevalue where value is not null
Just change the select to an update to write the values into the table.
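Both answers above boil down to linear interpolation between the nearest known values on either side of a gap. For checking the SQL output by hand, here is a small sketch of the same formula outside the database. It assumes that the formulas in the question's table are plain linear interpolation, which is how both answers read them; the function name `fill_linear` is made up, and the data is the 'abc' store from the first answer's test rows.

```python
def fill_linear(points):
    """Fill None values by linear interpolation between neighbouring known points.

    points: list of (interval, value_or_None), one store, any order of gaps.
    Gaps without a known value on both sides are left as None.
    """
    known = [(x, v) for x, v in points if v is not None]
    filled = []
    for x, v in points:
        if v is None:
            prev = max((k for k in known if k[0] < x), default=None)
            nxt = min((k for k in known if k[0] > x), default=None)
            if prev is not None and nxt is not None:
                (x0, v0), (x1, v1) = prev, nxt
                # Same expression as in the SQL: slope * distance + previous value.
                v = (v1 - v0) / (x1 - x0) * (x - x0) + v0
        filled.append((x, v))
    return filled


abc = [(1, 0.56), (5, None), (10, 0.63), (15, None), (20, None), (25, None), (30, 0.96)]
for interval, value in fill_linear(abc):
    print(interval, value)
```

Interpolating from the last known value or from the previously computed value gives the same number, since both lie on the same straight line.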