Select count from multiple tables and group by one field - mysql

I have "users" table with fields
user_name, user_id
I have data tables like
data_table_2012_10
data_table_2012_11
data_table_2012_12
data_table_2013_01
data_table_2013_02
each table contains the following fields
user_id, type ('ALARM', 'EMERGENCY', 'ALIVE', 'DEAD'), date_time
There will be millions of records in each table.
I have to select the count of each type from the data tables within a time frame given by the user, and also get the corresponding user name via user_id.
Can someone help me out with the best solution?

Try this query, where DATE1 and DATE2 are the bounds of your date range. You should union all the tables in the inner query. You can also build the query dynamically so that the inner query includes only those tables that fall inside the date range you use:
select t.user_id,t.type, MAX(users.user_name), SUM(t.cnt)
from
(
select user_id,type,count(*) cnt
from data_table_2012_10 where date_time between DATE1 and DATE2
group by user_id,type
union all
select user_id,type,count(*) cnt
from data_table_2012_11 where date_time between DATE1 and DATE2
group by user_id,type
union all
.........................................
union all
select user_id,type,count(*) cnt
from data_table_2013_02 where date_time between DATE1 and DATE2
group by user_id,type
) t
left join users on (t.user_id=users.user_id)
group by t.user_id,t.type
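The suggestion to build the inner query dynamically could be sketched roughly as follows. This is only an illustration in Python: the helper names `monthly_tables` and `build_count_query` are made up, the table naming follows the `data_table_YYYY_MM` pattern from the question, and the `%s` placeholders assume a MySQL driver that uses that parameter style.

```python
from datetime import date

def monthly_tables(start, end, prefix="data_table"):
    """Return the names of the monthly tables that overlap [start, end]."""
    names = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        names.append(f"{prefix}_{year:04d}_{month:02d}")
        # Advance one month, rolling over into the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return names

def build_count_query(start, end):
    """Build the UNION ALL query over only the tables inside the range.

    The date bounds stay as %s placeholders (two per table), so the caller
    passes them as parameters instead of pasting them into the SQL text.
    Assumes start <= end; otherwise no table is selected.
    """
    part = (
        "select user_id, type, count(*) cnt\n"
        "from {table} where date_time between %s and %s\n"
        "group by user_id, type"
    )
    inner = "\nunion all\n".join(
        part.format(table=name) for name in monthly_tables(start, end)
    )
    return (
        "select t.user_id, t.type, max(users.user_name) user_name, sum(t.cnt) cnt\n"
        f"from (\n{inner}\n) t\n"
        "left join users on (t.user_id = users.user_id)\n"
        "group by t.user_id, t.type"
    )

tables = monthly_tables(date(2012, 11, 15), date(2013, 1, 10))
print(tables)
print(build_count_query(date(2012, 11, 15), date(2013, 1, 10)))
```

Since each table contributes two placeholders, the parameter list passed to the driver would repeat the two bounds once per table returned by `monthly_tables`.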

Remember to use UNION ALL, not UNION: UNION removes duplicate rows, so identical (user_id, type, cnt) rows coming from different tables would be merged into one and the totals would come out too low.
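To see why this matters, here is a small sketch that runs both variants against an in-memory SQLite database (used only because it ships with Python; the question is about MySQL, where the same rule applies). The two tables and their rows are made-up sample data, and the date filter is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table data_table_2012_10 (user_id integer, type text, date_time text);
    create table data_table_2012_11 (user_id integer, type text, date_time text);
    -- User 1 has exactly two ALARM rows in each month.
    insert into data_table_2012_10 values
        (1, 'ALARM', '2012-10-05 10:00:00'), (1, 'ALARM', '2012-10-20 11:00:00');
    insert into data_table_2012_11 values
        (1, 'ALARM', '2012-11-03 09:00:00'), (1, 'ALARM', '2012-11-18 12:00:00');
""")

query = """
    select t.user_id, t.type, sum(t.cnt)
    from (
        select user_id, type, count(*) cnt from data_table_2012_10 group by user_id, type
        {op}
        select user_id, type, count(*) cnt from data_table_2012_11 group by user_id, type
    ) t
    group by t.user_id, t.type
"""

with_union_all = conn.execute(query.format(op="union all")).fetchall()
with_union = conn.execute(query.format(op="union")).fetchall()
print(with_union_all)  # both monthly rows (1, 'ALARM', 2) are kept and summed
print(with_union)      # the two identical rows collapse into one before the sum
```

With UNION ALL the total is 4; with UNION the two identical per-month rows are deduplicated first and the total drops to 2.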

Related

SQL get one time customers by email field

I have a database with over 100,000 records. I'm trying to get all customers who ordered only once searching by customer's email field (OrderEmail).
The SQL query is running for 10 minutes and then times out.
If I use short date ranges, I can get results but it still takes over 3 minutes.
How can I optimize the syntax to get it work?
SELECT
tblOrders.OrderID,
tblOrders.OrderName,
tblOrders.OrderEmail,
tblOrders.OrderPhone,
tblOrders.OrderCountry,
tblOrders.OrderDate
FROM
tblOrders
LEFT JOIN tblOrders AS orders_join ON orders_join.OrderEmail = tblOrders.OrderEmail
AND NOT orders_join.OrderID = tblOrders.OrderID
WHERE
orders_join.OrderID IS NULL
AND (tblOrders.OrderDate BETWEEN '2015-01-01' AND '2017-03-01')
AND tblOrders.OrderDelivered = - 1
ORDER BY
tblOrders.OrderID ASC;
I would expect the query below to work, but I can't test it against your data as you don't provide a sample. So I added a temporary table definition that can be used to try the query.
However, if you could change the data model to use an INTEGER id for the entity that placed the order (instead of a VARCHAR() email address), the query would get considerably faster.
CREATE TEMPORARY TABLE IF NOT EXISTS
tblorders(orderid,ordername,orderemail,orderphone,ordercountry,orderdate) AS (
SELECT 1,'ORD01','adent#hog.com' ,'9-991' ,'UK', DATE '2017-01-01'
UNION ALL SELECT 2,'ORD02','tricia#hog.com','9-992' ,'UK', DATE '2017-01-02'
UNION ALL SELECT 3,'ORD03','ford#hog.com' ,'9-993' ,'UK', DATE '2017-01-03'
UNION ALL SELECT 4,'ORD04','zaphod#hog.com','9-9943','UK', DATE '2017-01-04'
UNION ALL SELECT 5,'ORD05','marvin#hog.com','9-9942','UK', DATE '2017-01-05'
UNION ALL SELECT 6,'ORD06','ford#hog.com' ,'9-993' ,'UK', DATE '2017-01-06'
UNION ALL SELECT 7,'ORD07','tricia#hog.com','9-992' ,'UK', DATE '2017-01-07'
UNION ALL SELECT 8,'ORD08','benji#hog.com' ,'9-995' ,'UK', DATE '2017-01-08'
UNION ALL SELECT 9,'ORD09','benji#hog.com' ,'9-995' ,'UK', DATE '2017-01-09'
UNION ALL SELECT 10,'ORD10','ford#hog.com' ,'9-993' ,'UK', DATE '2017-01-10'
)
;
SELECT
tblOrders.OrderID
, tblOrders.OrderName
, tblOrders.OrderEmail
, tblOrders.OrderPhone
, tblOrders.OrderCountry
, tblOrders.OrderDate
FROM tblOrders
JOIN (
SELECT
OrderEmail
FROM tblOrders
GROUP BY
OrderEmail
HAVING COUNT(*) = 1
) singleOrders
ON singleOrders.OrderEmail = tblOrders.OrderEmail
ORDER BY OrderID
;
OrderID|OrderName|OrderEmail |OrderPhone|OrderCountry|OrderDate
1|ORD01 |adent#hog.com |9-991 |UK |2017-01-01
4|ORD04 |zaphod#hog.com|9-9943 |UK |2017-01-04
5|ORD05 |marvin#hog.com|9-9942 |UK |2017-01-05
As you can see, it returns Mr. Dent, Zaphod and Marvin, who all occur only once in the example data.
Another approach that might work is that you group by email address and get only those with one entry. It may behave unpredictably if you want to get customers with multiple orders but it should be fine for this particular case:
SELECT
tblOrders.OrderID,
tblOrders.OrderName,
tblOrders.OrderEmail,
tblOrders.OrderPhone,
tblOrders.OrderCountry,
tblOrders.OrderDate,
count(tblOrders.OrderID) as OrderCount
FROM
tblOrders
WHERE
tblOrders.OrderDate BETWEEN '2015-01-01' AND '2017-03-01'
AND tblOrders.OrderDelivered = - 1
GROUP BY
tblOrders.OrderEmail
HAVING
OrderCount = 1
ORDER BY
tblOrders.OrderID ASC;
Also, I suspect that if you're seeing such long query times with just 100k records, you probably don't have an index on the OrderEmail column. I suggest setting one up; it might help with your original queries as well.
This does not work in Oracle or SQL Server, but it does work in SQLite and in MySQL (as long as ONLY_FULL_GROUP_BY is not enabled). So, while the code is not portable between different RDBMS, it works for this particular case.
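As a rough illustration of the index suggestion combined with the portable `HAVING COUNT(*) = 1` join from the first answer, here is a sketch against an in-memory SQLite database. The index name `idx_orders_email` is made up, only three of the columns are kept, and the rows are copied from the sample data above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tblOrders (OrderID integer primary key, OrderName text, OrderEmail text)"
)
conn.executemany(
    "insert into tblOrders values (?, ?, ?)",
    [
        (1, "ORD01", "adent#hog.com"),
        (2, "ORD02", "tricia#hog.com"),
        (3, "ORD03", "ford#hog.com"),
        (4, "ORD04", "zaphod#hog.com"),
        (5, "ORD05", "marvin#hog.com"),
        (6, "ORD06", "ford#hog.com"),
        (7, "ORD07", "tricia#hog.com"),
        (8, "ORD08", "benji#hog.com"),
        (9, "ORD09", "benji#hog.com"),
        (10, "ORD10", "ford#hog.com"),
    ],
)

# Index on the column used for grouping and joining (hypothetical name).
conn.execute("create index idx_orders_email on tblOrders (OrderEmail)")

rows = conn.execute("""
    select o.OrderID, o.OrderName, o.OrderEmail
    from tblOrders o
    join (
        select OrderEmail
        from tblOrders
        group by OrderEmail
        having count(*) = 1
    ) singleOrders on singleOrders.OrderEmail = o.OrderEmail
    order by o.OrderID
""").fetchall()

single_order_ids = [row[0] for row in rows]
print(single_order_ids)
```

The `create index` statement uses the same syntax in MySQL; how much it helps depends on the data and on the query plan the server picks.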

SQL Group by day from timestamp with two tables

I have two tables with timestamp columns.
Table #1 contains clicks and timestamp columns, and Table #2 contains userid and timestamp columns. I want the counts of clicks and users by date, for example:
Date clicks_count users_count
2015-07-24 10 15
2015-07-24 04 06
I think this SQL will be useful to you.
select a.date1,clicks_count,users_count from
(select date(Table1.timestamp)as date1, count(clicks) as clicks_count
from Table1
group by date(Table1.timestamp)) as a
join
(
select date(Table2.timestamp) date2, count(userid) as users_count
from Table2
group by date(Table2.timestamp)) b on a.date1 = b.date2
Thank you.
select date(timestamp),
sum(is_click) as clicks,
sum(is_click = 0) as user_count
from
(
select timestamp, 1 as is_click from table1
union all
select timestamp, 0 from table2
) tmp
group by date(timestamp)
You can select the timestamps from both tables together and add a calculated column that indicates which table each timestamp came from.
Then you take that subquery result, group it by date, and count the users and clicks.
sum(is_click = 0) counts how many times the timestamp came from the users table.
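A quick way to try the UNION ALL approach is an in-memory SQLite database, sketched below with made-up sample rows. One property worth noticing: dates that appear in only one of the two tables still show up (with a count of 0 for the other), whereas an inner join of two per-table aggregates would drop them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table1 (clicks integer, timestamp text);
    create table table2 (userid integer, timestamp text);
    insert into table1 values
        (1, '2015-07-24 09:00:00'),
        (1, '2015-07-24 17:30:00'),
        (1, '2015-07-25 08:15:00');
    insert into table2 values
        (10, '2015-07-24 10:00:00'),
        (11, '2015-07-26 11:00:00'),
        (12, '2015-07-26 12:00:00');
""")

per_day = conn.execute("""
    select date(timestamp) as day,
           sum(is_click) as clicks,
           sum(is_click = 0) as user_count
    from (
        -- Tag every row with the table it came from.
        select timestamp, 1 as is_click from table1
        union all
        select timestamp, 0 from table2
    ) tmp
    group by day
    order by day
""").fetchall()

for row in per_day:
    print(row)
```

Here 2015-07-25 has clicks but no users and 2015-07-26 has users but no clicks, and both days are still reported.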

Get column from 2 tables and sort by date

I have 2 tables, both containing an event and a date column. Is there a way to combine the event fields of both tables into one result and sort it by the date field? That way only a single (combined) event list is returned instead of 2.
SELECT event,date FROM table1
UNION
SELECT event,date FROM table2 ORDER BY date
When using UNION, put ORDER BY after the last query; it then sorts the merged result.
You can't attach it to an earlier query without parentheses anyway; that should throw an error.
SELECT a.event, MAX(a.date) date
FROM
(
SELECT event, date FROM TableA
UNION
SELECT event, date FROM TableB
) a
GROUP BY a.event
ORDER BY MAX(a.date) DESC
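Both variants can be tried side by side with an in-memory SQLite database. This is only a sketch with made-up rows: the first query is the plain UNION with a trailing ORDER BY, the second keeps one row per event with its latest date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table1 (event text, date text);
    create table table2 (event text, date text);
    insert into table1 values ('launch', '2015-01-10'), ('review', '2015-02-01');
    insert into table2 values ('launch', '2015-03-05'), ('audit', '2015-01-20');
""")

# A trailing ORDER BY applies to the whole merged result.
merged = conn.execute("""
    select event, date from table1
    union
    select event, date from table2
    order by date
""").fetchall()

# One row per event, keeping the most recent date of each.
latest_per_event = conn.execute("""
    select a.event, max(a.date) as latest
    from (
        select event, date from table1
        union
        select event, date from table2
    ) a
    group by a.event
    order by latest desc
""").fetchall()

print(merged)
print(latest_per_event)
```

The first query returns four rows (the 'launch' event appears twice, once per table); the second returns three, one per distinct event.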

Condition for counting distinct rows in an SQL query

I have a table in a MySQL database with an ID column. This is not a key of the table and several rows can have the same ID.
I don't really know SQL but I already figured out how to obtain the number of distinct IDs:
SELECT COUNT(DISTINCT ID) FROM mytable;
Now I want to count only those IDs which appear more than 2 times in the table.
So if the ID column contains the values
3 4 4 5 5 5 6 7 7 7
the query should return 2.
I have no idea how to do this. I hope someone can help me!
Btw, my table contains a huge number of rows. So if there are several possibilities I would also be happy to know which solution is the most efficient.
Try this:
SELECT COUNT(ID) FROM (
SELECT ID FROM mytable
GROUP BY ID
HAVING COUNT(ID) > 2) p
select count(*) from
(select count(id) as cnt,id from mytable group by id) da
where da.cnt>2
The inner query gives you the number of rows for each id, and the outer query keeps only the ids with more than 2.
SELECT
COUNT(ids)
FROM
(SELECT
COUNT(ID)AS ids
FROM
mytable
GROUP BY
ID
HAVING
ids>2
)AS tbl1
Updated :
SELECT count(ID)
FROM (
SELECT ID FROM mytable
GROUP BY ID
HAVING count(ID) > 2
) p
should do what you need
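The GROUP BY / HAVING approach can be checked against the values from the question. The sketch below uses an in-memory SQLite database purely for convenience; the query itself is the one shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (ID integer)")
# The ID values listed in the question.
values = [3, 4, 4, 5, 5, 5, 6, 7, 7, 7]
conn.executemany("insert into mytable values (?)", [(v,) for v in values])

ids_seen_more_than_twice = conn.execute("""
    select count(ID)
    from (
        select ID from mytable
        group by ID
        having count(ID) > 2
    ) p
""").fetchone()[0]

print(ids_seen_more_than_twice)
```

Only 5 and 7 occur more than twice, so the query yields 2, as the question expects.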

How to use query results in another query?

I am trying to write a query which will give me the last entry of each month in a table called transactions. I believe I am halfway there as I have the following query which groups all the entries by month then selects the highest id in each group which is the last entry for each month.
SELECT max(id),
EXTRACT(YEAR_MONTH FROM date) as yyyymm
FROM transactions
GROUP BY yyyymm
Gives the correct results
id yyyymm
100 201006
105 201007
111 201008
118 201009
120 201010
I don’t know how to then run a query on the same table but select the balance column where it matches the id from the first query to give results
id balance date
120 10000 2010-10-08
118 11000 2010-09-29
I've tried subqueries and looked at joins, but I'm not sure how to go about using them.
You can make your first select an inline view, and then join to it. Something like this (not tested, but should give you the idea):
SELECT x.id
, t.balance
, t.date
FROM transactions t
/* here, we make your select an inline view, then we can join to it */
, (SELECT max(id) id,
EXTRACT(YEAR_MONTH FROM date) as yyyymm
FROM transactions
GROUP BY yyyymm) x
WHERE t.id = x.id
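Here is a sketch of the same inline-view idea that can actually be run, using an in-memory SQLite database. Two things are assumptions for the sake of the example: SQLite has no `EXTRACT(YEAR_MONTH FROM ...)`, so `strftime('%Y%m', date)` stands in for it, and the rows (in particular the balances of ids 117 and 119) are made up around the two rows shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table transactions (id integer primary key, balance integer, date text)")
conn.executemany(
    "insert into transactions values (?, ?, ?)",
    [
        (117, 11500, "2010-09-15"),
        (118, 11000, "2010-09-29"),
        (119, 10500, "2010-10-02"),
        (120, 10000, "2010-10-08"),
    ],
)

last_per_month = conn.execute("""
    select x.id, t.balance, t.date
    from transactions t
    join (
        -- Highest id in each month, i.e. the last entry of that month.
        select max(id) as id, strftime('%Y%m', date) as yyyymm
        from transactions
        group by yyyymm
    ) x on t.id = x.id
    order by x.id desc
""").fetchall()

for row in last_per_month:
    print(row)
```

The explicit `join ... on` form is equivalent to listing both sources in FROM and matching them in WHERE, as in the answer above.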