Counting users between two dates - mysql

I have a table that has just two fields - a date field and customer_id. I am looking to count the number of customer ids from each date field to current date. My query below is timing out - seems very inefficient. Is there a better way to do this?
select
t.base_date,
( select
count(distinct customer_id)
from user_base as ub
where ub.base_date >= t.base_date
and ub.base_date <= current_date
) as cts
from user_base as t

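This is untested, but the correlated subquery, which runs once per row of user_base, is the likely bottleneck. Try joining each distinct date to the rows on or after it and grouping instead:
select d.base_date, count(distinct ub.customer_id) as cts
from (select distinct base_date from user_base) as d
join user_base as ub
on ub.base_date >= d.base_date and ub.base_date <= current_date
group by d.base_date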
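One way to express the cumulative count without a correlated subquery is to join the distinct dates to the rows on or after each date. The following is only a minimal sketch: it uses SQLite through Python's sqlite3 module as a stand-in for MySQL (so DATE('now') replaces current_date), and the sample rows and the index name idx_user_base_date are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_base (base_date TEXT, customer_id INTEGER)")
# Hypothetical sample data: ISO dates stored as text compare correctly as strings.
conn.executemany(
    "INSERT INTO user_base VALUES (?, ?)",
    [
        ("2018-01-01", 1),
        ("2018-01-01", 2),
        ("2018-01-02", 2),
        ("2018-01-03", 3),
    ],
)
# Assumed index (hypothetical name) so the range join can use base_date.
conn.execute(
    "CREATE INDEX idx_user_base_date ON user_base (base_date, customer_id)"
)

query = """
SELECT d.base_date, COUNT(DISTINCT ub.customer_id) AS cts
FROM (SELECT DISTINCT base_date FROM user_base) AS d
JOIN user_base AS ub
  ON ub.base_date >= d.base_date
 AND ub.base_date <= DATE('now')
GROUP BY d.base_date
ORDER BY d.base_date
"""
rows = conn.execute(query).fetchall()
for base_date, cts in rows:
    print(base_date, cts)
```

The join still compares every distinct date with every later row, so on a large table it may be worth checking the plan with EXPLAIN before relying on it.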

Related

counting occurrences between dates of different date intervals

I have a query that give me a table like this:
Person | Date_IN | Date_OUT | Structure
During a year a person can ENTER and EXIT many times; ENTER and EXIT can also fall on the same day.
I'd like to count, for a specific day of the year, how many persons were IN each structure.
The final goal is to have, for a given period (1st March --> 31st March), the total number of persons for each day and each structure.
I believe the following would work. It assumes that you have a table of dates (consists of one column which contains all the dates between 1950 and 2050) and you simply join it with the person check in/out table:
SELECT dates.date, Structure, COUNT(DISTINCT Person) Persons_on_That_Date
FROM dates
LEFT JOIN turndata ON dates.date BETWEEN Date_IN AND Date_OUT
WHERE dates.date BETWEEN '2018-03-01' AND '2018-03-31'
GROUP BY dates.date, Structure
ORDER BY Structure, dates.date
Note: the above assumes that the out date is inclusive (the person is counted as inside on that date). If out date is exclusive then the ON clause becomes:
... ON Date_IN <= dates.date AND dates.date < Date_OUT
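To see the calendar-table join at work on a few rows, here is a small sketch. It assumes SQLite (via Python's sqlite3) instead of MySQL, a four-day dates table filled from Python, and invented sample stays; only the table and column names come from the answer above.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dates (date TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE turndata (Person TEXT, Date_IN TEXT, Date_OUT TEXT, Structure TEXT)"
)

# Fill the calendar table with 2018-03-01 .. 2018-03-04 (hypothetical range).
start = date(2018, 3, 1)
conn.executemany(
    "INSERT INTO dates VALUES (?)",
    [((start + timedelta(days=i)).isoformat(),) for i in range(4)],
)
# Hypothetical stays; Date_OUT is treated as inclusive.
conn.executemany(
    "INSERT INTO turndata VALUES (?, ?, ?, ?)",
    [
        ("ann", "2018-03-01", "2018-03-02", "A"),
        ("bob", "2018-03-02", "2018-03-02", "A"),
        ("ann", "2018-03-04", "2018-03-04", "B"),
    ],
)

query = """
SELECT dates.date, Structure, COUNT(DISTINCT Person) AS persons
FROM dates
LEFT JOIN turndata ON dates.date BETWEEN Date_IN AND Date_OUT
WHERE dates.date BETWEEN '2018-03-01' AND '2018-03-04'
GROUP BY dates.date, Structure
ORDER BY Structure, dates.date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because of the LEFT JOIN, a day on which nobody was present still appears, with a NULL structure and a count of 0; SQLite sorts that NULL row first.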
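Please use the query below; it groups the data by structure for a particular time frame. Note that it counts persons whose Date_IN falls within the range rather than who was present on each day.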
SELECT structure, COUNT(DISTINCT person) as no_of_person
FROM table_name
WHERE DATE(Date_IN) BETWEEN '2018-08-01' AND '2018-08-31'
GROUP BY structure
You say there cannot be multiple date_in values for the same day and person, because a person stays in for at least one day. So for a given date we only need to look at the latest event per person up to that date to see whether the person was in that day.
These are the steps:
create a data set for the required days on-the-fly
join with the table and get the last date_in until that day per person
join with the table again to get the last records
aggregate per day and count persons present
This is:
select
data.day,
sum(t.date_in is not null and (t.date_out is null or t.date_out >= data.day)) as count_in
from
(
select days.day, t.person, max(t.date_in) as max_date_in
from (select date '2018-03-01' as day union all ...) days
left join t on t.date_in <= days.day
group by days.day, t.person
) data
left join t on t.person = data.person and t.date_in = data.max_date_in
group by data.day
order by data.day;

Select query with calculated field that uses subquery with multiple conditions

Three Tables:
COST_SAVINGS: COST_SAVINGS_ID, ONE_TIME_CREDIT, CREATION_DATE, INVOICE_ID (FK to Invoice table)
INVOICE: INVOICE_ID, INVOICE_CURRENCY_CODE
EXCHANGE_RATE: CURRENCY_RATE, CURRENCY_DATE
I'm reporting on Cost Savings (the first table). The challenge is that each cost savings amount can be in a different currency so I need a field that shows the converted amount based on the currency from the invoice table and a matching month / year between the Exchange.Ex_Date and Cost_Savings.Create_Date.
I'm getting an error that states:
single-row subquery returns more than one row
This is what I have so far:
SELECT
COST_SAVINGS.COST_SAVINGS_ID,
COST_SAVINGS.CLAIM_TYPE,
COST_SAVINGS.COMMENTS,
COST_SAVINGS.COST_SAVINGS_STATUS,
COST_SAVINGS.CREATION_DATE,
COST_SAVINGS.DESCRIPTION,
COST_SAVINGS.ONE_TIME_CREDIT AS CREDIT_IN_NATIVE_CURRENCY,
FINANCE_INVOICE.CURRENCY_CODE,
COST_SAVINGS.ONE_TIME_CREDIT *
(SELECT EXCHANGE_RATE.CURRENCY_RATE
FROM EXCHANGE_RATE
WHERE EXTRACT (MONTH FROM COST_SAVINGS.CREATION_DATE) = EXTRACT (MONTH FROM EXCHANGE_RATE.CURRENCY_DATE)
AND EXTRACT (YEAR FROM COST_SAVINGS.CREATION_DATE) = EXTRACT (YEAR FROM EXCHANGE_RATE.CURRENCY_DATE)
AND FINANCE_INVOICE.CURRENCY_CODE = EXCHANGE_RATE.CURRENCY_CODE) AS CREDIT_IN_USD
FROM COST_SAVINGS
LEFT JOIN FINANCE_INVOICE ON COST_SAVINGS.INVOICE_ID = FINANCE_INVOICE.INVOICE_ID
I feel like the issue may be with the third condition in my subquery's WHERE clause (trying to match the currency codes). I'm not sure how to resolve it though. Any thoughts?
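Try putting LIMIT 1 in your subquery (LIMIT is MySQL/PostgreSQL syntax; the error text looks like Oracle's ORA-01427, and in Oracle you would add AND ROWNUM = 1 to the subquery's WHERE clause instead):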
...RENCY_CODE = EXCHANGE_RATE.CURRENCY_CODE LIMIT 1) AS CREDIT_IN_USD
I believe your subquery is returning multiple rows. That doesn't work when you use a subquery in place of a column name in your SELECT clause.
If you don't want to use a subquery in place of a column name, try something like this. You'll join to a subquery generated virtual table.
Here's the virtual table for exchange rates. It uses GROUP BY to make one row (or no rows) per month/year/currency code. If the raw table has more than one row for any month/year/code, they get averaged with AVG(). You could also use MAX() or MIN().
SELECT AVG(CURRENCY_RATE) CURRENCY_RATE,
CURRENCY_CODE,
EXTRACT(MONTH FROM CURRENCY_DATE) MONTH,
EXTRACT(YEAR FROM CURRENCY_DATE) YEAR
FROM EXCHANGE_RATE
GROUP BY CURRENCY_CODE,
EXTRACT(MONTH FROM CURRENCY_DATE),
EXTRACT(YEAR FROM CURRENCY_DATE)
Try this and convince yourself it works.
Then build it into your overall query.
SELECT
COST_SAVINGS.COST_SAVINGS_ID,
COST_SAVINGS.whatever, ...
FINANCE_INVOICE.CURRENCY_CODE,
(COST_SAVINGS.ONE_TIME_CREDIT * RATE.CURRENCY_RATE) AS CREDIT_IN_USD
FROM COST_SAVINGS
LEFT JOIN FINANCE_INVOICE ON COST_SAVINGS.INVOICE_ID = FINANCE_INVOICE.INVOICE_ID
LEFT JOIN (SELECT AVG(CURRENCY_RATE) CURRENCY_RATE,
CURRENCY_CODE,
EXTRACT(MONTH FROM CURRENCY_DATE) MONTH,
EXTRACT(YEAR FROM CURRENCY_DATE) YEAR
FROM EXCHANGE_RATE
GROUP BY CURRENCY_CODE,
EXTRACT(MONTH FROM CURRENCY_DATE),
EXTRACT(YEAR FROM CURRENCY_DATE)
) RATE ON RATE.CURRENCY_CODE = FINANCE_INVOICE.CURRENCY_CODE
AND RATE.YEAR = EXTRACT(YEAR FROM COST_SAVINGS.CREATION_DATE)
AND RATE.MONTH = EXTRACT(MONTH FROM COST_SAVINGS.CREATION_DATE)
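The aggregated-rate join can be tried on toy data with the sketch below. It is an illustration under assumptions: SQLite (through Python's sqlite3) replaces the original database, strftime('%Y-%m', ...) replaces the two EXTRACT calls, EXCHANGE_RATE is assumed to carry a CURRENCY_CODE column as the question's query implies, and all rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE COST_SAVINGS (COST_SAVINGS_ID INTEGER, ONE_TIME_CREDIT REAL,
                           CREATION_DATE TEXT, INVOICE_ID INTEGER);
CREATE TABLE FINANCE_INVOICE (INVOICE_ID INTEGER, CURRENCY_CODE TEXT);
CREATE TABLE EXCHANGE_RATE (CURRENCY_RATE REAL, CURRENCY_CODE TEXT,
                            CURRENCY_DATE TEXT);
""")
# Hypothetical data: two EUR rates in March 2018 are what would break
# a single-row scalar subquery.
conn.executemany("INSERT INTO EXCHANGE_RATE VALUES (?, ?, ?)", [
    (1.25, "EUR", "2018-03-05"),
    (1.5, "EUR", "2018-03-20"),
    (1.0, "USD", "2018-03-01"),
])
conn.executemany("INSERT INTO FINANCE_INVOICE VALUES (?, ?)", [
    (10, "EUR"),
    (11, "USD"),
])
conn.executemany("INSERT INTO COST_SAVINGS VALUES (?, ?, ?, ?)", [
    (1, 100.0, "2018-03-15", 10),
    (2, 50.0, "2018-03-16", 11),
    (3, 10.0, "2018-04-01", 10),  # no EUR rate exists for April
])

query = """
SELECT cs.COST_SAVINGS_ID,
       fi.CURRENCY_CODE,
       cs.ONE_TIME_CREDIT * r.CURRENCY_RATE AS CREDIT_IN_USD
FROM COST_SAVINGS cs
LEFT JOIN FINANCE_INVOICE fi ON cs.INVOICE_ID = fi.INVOICE_ID
LEFT JOIN (
    -- one averaged rate per currency and month
    SELECT AVG(CURRENCY_RATE) AS CURRENCY_RATE,
           CURRENCY_CODE,
           strftime('%Y-%m', CURRENCY_DATE) AS YM
    FROM EXCHANGE_RATE
    GROUP BY CURRENCY_CODE, strftime('%Y-%m', CURRENCY_DATE)
) r ON r.CURRENCY_CODE = fi.CURRENCY_CODE
   AND r.YM = strftime('%Y-%m', cs.CREATION_DATE)
ORDER BY cs.COST_SAVINGS_ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

A month with no rate yields NULL rather than an error, which makes missing exchange-rate data easy to spot in the report.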

related to query using SQL

In Oracle SQL, how do I get the count of newly added customers for April and May only, making sure they weren't there in previous months?
SELECT CUSTOMER ID , COUNT(*)
FROM TABLE
WHERE DATE BETWEEN '1-APR-2018' AND '31-MAY-2018' AND ...
If we take max(date) and min(date), we can compare them to check whether the customer is new, correct?
The expected output is a count per month:
april ---
may ---
It should show exactly how many new customers joined in each of these two months.
One approach is to use aggregation:
select customer_id, min(date) as min_date
from t
group by customer_id
having min(date) >= date '2018-04-01' and
min(date) < date '2018-06-01';
This gets the list of customers (which your query seems to be doing). To get the count, just use count(*) and make this a subquery.
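Wrapping the aggregation in a subquery and counting per month could look like the sketch below. It assumes SQLite via Python's sqlite3 rather than Oracle, so plain ISO date strings replace the date literals and strftime('%Y-%m', ...) does the month bucketing; the alias first_seen and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (customer_id INTEGER, date TEXT)")
# Hypothetical activity rows.
conn.executemany("INSERT INTO t VALUES (?, ?)", [
    (1, "2018-02-10"), (1, "2018-04-05"),  # already a customer in February
    (2, "2018-04-03"), (2, "2018-05-01"),  # first seen in April
    (3, "2018-05-20"),                     # first seen in May
    (4, "2018-04-30"),                     # first seen in April
    (5, "2018-06-02"),                     # first seen after the window
])

query = """
SELECT strftime('%Y-%m', min_date) AS month, COUNT(*) AS new_customers
FROM (
    -- first activity date per customer, kept only if it falls in April or May
    SELECT customer_id, MIN(date) AS min_date
    FROM t
    GROUP BY customer_id
    HAVING MIN(date) >= '2018-04-01' AND MIN(date) < '2018-06-01'
) first_seen
GROUP BY month
ORDER BY month
"""
rows = conn.execute(query).fetchall()
for month, new_customers in rows:
    print(month, new_customers)
```

In Oracle the same shape should work with TO_CHAR(min_date, 'YYYY-MM') for the month bucket, though that variant is untested here.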

Streamline a MySQL Query looking for repeat orders for a given month?

I have a script that runs each month. It looks at the previous month's orders and checks how many of them were placed by an email address that also appears on orders from previous years, to determine how much repeat business we are getting compared to new business.
The problem is the database is growing, the business is doing better and this is taking a very long time. I assume I need to hone my skills a little bit. Looking for help to wrap my head around it.
Right now I do a simple query:
SELECT email, COUNT(orderid) as count, SUM(total) as revenue
FROM orders
WHERE date > '2017-05-01 00:00:00'
GROUP BY email;
Then I just use PHP to loop through those results doing a search for any matching email address in the previous period of time.
SELECT email, COUNT(orderid) as count, SUM(total) as revenue
FROM orders
WHERE date < '2017-05-01 00:00:00'
AND email = $email;
Of course, we are getting to the point where we are doing several thousand orders a month, and we've been doing business for several years and this process is becoming incredibly slow. Is there a way to combine this into a single query to increase performance? I've looked at subqueries but it would still be running the same number of queries, would still be just as slow just more condensed. Any ideas on how to improve this?
Right now I'm just running it once and saving the results to the report database so it only is done once each month, but I figured I should take the opportunity to ask for help also to see if I can improve.
I think this could be what you're looking for:
SELECT *
FROM (
SELECT email, COUNT(orderid) as count, SUM(total) as revenue
FROM orders
WHERE date < '2017-05-01 00:00:00'
GROUP BY email) as o1
INNER JOIN (
SELECT email, COUNT(orderid) as count, SUM(total) as revenue
FROM orders
WHERE date >= '2017-05-01 00:00:00'
GROUP BY email) as o2
ON o2.email = o1.email;
You would just have to name your aliases properly and that's it. This runs one subquery per period, and you get a result row only when an email has matches in both. To make this as efficient as possible, create an index with date as the first key column.
Also, if I understood you correctly, the second subquery doesn't need grouping at all if you only want to know which emails placed an order in the latest period, so your query could look like this:
SELECT o1.email, COUNT(o1.orderid) as count, SUM(o1.total) as revenue
FROM orders as o1
WHERE o1.date < '2017-05-01 00:00:00'
AND EXISTS (SELECT *
FROM orders AS o2
WHERE o2.email = o1.email
AND o2.date >= '2017-05-01 00:00:00')
GROUP BY o1.email;
Have you tried a nested query?
Although you're scanning the same data, there is an overhead with returning the first result set to PHP, and each subsequent query.
With a nested query you avoid this, and allow the database to make its own internal optimisations, which can be significant.
Something like this should do it:
SELECT
new_orders.email,
COUNT(new_orders.orderid) as count,
SUM(new_orders.total) as revenue
FROM
orders new_orders
join (select distinct email from orders where date <= '2017-05-01 00:00:00') old_orders on old_orders.email = new_orders.email
WHERE
new_orders.date > '2017-05-01 00:00:00'
GROUP BY
new_orders.email
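If both repeat and new customers are wanted from one pass, a variation is to aggregate the recent orders and flag each email with EXISTS. This is a sketch under assumptions: SQLite through Python's sqlite3 stands in for MySQL, the is_repeat column and the index name idx_orders_email_date are hypothetical, and the orders are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (orderid INTEGER PRIMARY KEY, email TEXT, total REAL, date TEXT)"
)
# Hypothetical orders around the 2017-05-01 cutoff.
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    (1, "a@x.com", 10.0, "2016-05-01 10:00:00"),
    (2, "a@x.com", 20.0, "2017-05-03 09:00:00"),
    (3, "b@x.com", 5.0, "2017-05-04 09:00:00"),
    (4, "a@x.com", 30.0, "2017-05-10 09:00:00"),
    (5, "c@x.com", 7.0, "2015-01-01 00:00:00"),
])
# Assumed index so the EXISTS probe is a lookup rather than a scan.
conn.execute("CREATE INDEX idx_orders_email_date ON orders (email, date)")

query = """
SELECT n.email,
       COUNT(n.orderid) AS order_count,
       SUM(n.total) AS revenue,
       -- 1 if this email also ordered before the cutoff, else 0
       EXISTS (SELECT 1 FROM orders o
               WHERE o.email = n.email AND o.date < :cutoff) AS is_repeat
FROM orders n
WHERE n.date >= :cutoff
GROUP BY n.email
ORDER BY n.email
"""
rows = conn.execute(query, {"cutoff": "2017-05-01 00:00:00"}).fetchall()
for row in rows:
    print(row)
```

Passing the cutoff as a bound parameter also avoids building the SQL string by hand, which is the usual advice for values such as $email in the PHP loop.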

How to use query results in another query?

I am trying to write a query which will give me the last entry of each month in a table called transactions. I believe I am halfway there as I have the following query which groups all the entries by month then selects the highest id in each group which is the last entry for each month.
SELECT max(id),
EXTRACT(YEAR_MONTH FROM date) as yyyymm
FROM transactions
GROUP BY yyyymm
Gives the correct results
id yyyymm
100 201006
105 201007
111 201008
118 201009
120 201010
I don’t know how to then run a query on the same table but select the balance column where it matches the id from the first query to give results
id balance date
120 10000 2010-10-08
118 11000 2010-09-29
I've tried subqueries and looked at joins but I'm not sure how to go about using them.
You can make your first select an inline view, and then join to it. Something like this (not tested, but should give you the idea):
SELECT x.id
, t.balance
, t.date
FROM transactions t
/* here, we make your select an inline view, then we can join to it */
, (SELECT max(id) id,
EXTRACT(YEAR_MONTH FROM date) as yyyymm
FROM transactions
GROUP BY yyyymm) x
WHERE t.id = x.id
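The inline-view idea can be checked on a handful of rows with the sketch below. It assumes SQLite via Python's sqlite3 instead of MySQL, so strftime('%Y%m', date) replaces EXTRACT(YEAR_MONTH FROM date), and it uses an explicit JOIN rather than the comma join; the sample balances are made up apart from the two rows shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (id INTEGER PRIMARY KEY, balance INTEGER, date TEXT)"
)
# Rows 118 and 120 come from the question; the rest are hypothetical.
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", [
    (100, 9000, "2010-06-30"),
    (104, 9500, "2010-07-15"),
    (105, 9700, "2010-07-31"),
    (118, 11000, "2010-09-29"),
    (119, 10500, "2010-10-02"),
    (120, 10000, "2010-10-08"),
])

query = """
SELECT t.id, t.balance, t.date
FROM transactions AS t
JOIN (
    -- highest id per calendar month, i.e. the last entry of each month
    SELECT MAX(id) AS id
    FROM transactions
    GROUP BY strftime('%Y%m', date)
) AS x ON t.id = x.id
ORDER BY t.id DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

This relies, as the question does, on ids increasing with time; if that cannot be guaranteed, taking MAX(date) per month and joining on the date would be the safer variant.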