How to dynamically add months in Apache Drill

Query that works:
select
a,
DATE_ADD(date '2015-05-15', interval '1' month)
from
(
select '1' a, INSERTED_AT from dfs.data.bi_interaction limit 3
);
Query that does not work:
select
a,
DATE_ADD(date '2015-05-15', interval a month)
from
(
select '1' a, INSERTED_AT from dfs.data.bi_interaction limit 3
);
Any workaround?

The second query doesn't work because the DATE_ADD function doesn't support a column as the interval argument. If you have a use case for that please get in touch with the Drill team on the mailing lists here: https://drill.apache.org/mailinglists/

Related

Getting all previous records of table by date MySQL

My table currently has 21000 records. It is updated daily, with almost 300 entries inserted each day. I want a query that fetches the number of rows the table had on each of the previous 10 days, so it returns:
26000
21300
21000
etc
Right now, I wrote this:
"SELECT COUNT(*) from tbl_task where `task_start_time` < '2020-12-01'"
It returns 21000, but only for one day. I want the query to return the counts for each of the last 10 days.
Edit: the database is MySQL and the date column is a date, not a datetime.
The most efficient method may be aggregation and cumulative sums:
select date(task_start_time) as dte, count(*) as cnt_on_day,
sum(count(*)) over (order by date(task_start_time)) as running_cnt
from tbl_task
group by dte
order by dte desc
limit 10;
This returns the last 10 days in the data. You can easily adjust to more days if you like -- in fact all of them -- without much trouble.
I don't know if I'm wrong, but could you not simply add a GROUP BY clause? Like:
"SELECT COUNT(*) from tbl_task where `task_start_time` < '2020-12-01' GROUP BY task_start_time"
EDIT:
This should only work if task_start_time is a date, not if it is a datetime
EDIT2:
If it is a datetime you could use the date function:
SELECT COUNT(*) from tbl_task where `task_start_time` < '2020-12-01' GROUP
BY DATE(task_start_time)
You can use UNION ALL and date arithmetic.
SELECT count(*)
FROM tbl_task
WHERE task_start_time < current_date
UNION ALL
SELECT count(*)
FROM tbl_task
WHERE task_start_time < date_sub(current_date, INTERVAL 1 DAY)
...
UNION ALL
SELECT count(*)
FROM tbl_task
WHERE task_start_time < date_sub(current_date, INTERVAL 9 DAY);
Edit:
You might also join a derived table that uses FROM-less SELECTs and UNION ALL to get the days to look back and then aggregate. This might be a little easier to construct dynamically. (But it may be slower I suspect.)
SELECT count(*)
FROM (SELECT 0 x
UNION ALL
SELECT 1
...
UNION ALL
SELECT 9) x
INNER JOIN tbl_task t
ON t.task_start_time < date_sub(current_date, INTERVAL x.x DAY)
GROUP BY x.x;
In MySQL version 8+ you can even use a recursive CTE to construct the table with the days.
WITH RECURSIVE x
AS
(
SELECT 0 x
UNION ALL
SELECT x + 1
FROM x
WHERE x + 1 < 10
)
SELECT count(*)
FROM x
INNER JOIN tbl_task t
ON t.task_start_time < date_sub(current_date, INTERVAL x.x DAY)
GROUP BY x.x;
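The queries above return only the counts, so it is not obvious which row belongs to which day. A minimal sketch of one way to label them, assuming MySQL 8+ and the `tbl_task` table from the question; the names `days`, `n`, `cutoff` and `cnt` are illustrative and not from the original answer. A LEFT JOIN is used here so that a cutoff with no earlier rows still appears with a count of 0.

```sql
-- Build the offsets 0..9 with a recursive CTE.
WITH RECURSIVE days (n) AS (
    SELECT 0
    UNION ALL
    SELECT n + 1 FROM days WHERE n + 1 < 10
)
SELECT date_sub(current_date, INTERVAL d.n DAY) AS cutoff,  -- the day each count refers to
       count(t.task_start_time)                 AS cnt      -- rows that existed before that day
FROM days d
LEFT JOIN tbl_task t
       ON t.task_start_time < date_sub(current_date, INTERVAL d.n DAY)
GROUP BY d.n
ORDER BY d.n;
```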

How to return 0 instead of null on mysql query?

The following query returns the visitors and pageviews of last 7 days. However, if there are no results (let's say it is a fresh account), nothing is returned.
How can I edit this to return 0 for days that have no entries?
SELECT Date(timestamp) AS day,
Count(DISTINCT hash) AS visitors,
Count(*) AS pageviews
FROM behaviour
WHERE company_id = 1
AND timestamp >= Subdate(Curdate(), 7)
GROUP BY day
Assuming that you always have at least one record in the table for each of the last 7 days (regardless of the company_id), then you can use conditional aggregation as follows:
select
date(timestamp) as day,
count(distinct case when company_id = 1 then hash end) as visitors,
sum(company_id = 1) as pageviews
from behaviour
where timestamp >= curdate() - interval 7 day
group by day
Note that I changed your query to use standard date arithmetic, which I find easier to understand than date functions.
Otherwise, you would need to move the condition on the date from the where clause to the aggregate functions:
select
date(timestamp) as day,
count(distinct case when timestamp >= curdate() - interval 7 day and company_id = 1 then hash end) as visitors,
sum(timestamp >= curdate() - interval 7 day and company_id = 1) as pageviews
from behaviour
group by day
If your table is big, this can be expensive so I would not recommend that.
Alternatively, you can generate a derived table of dates and left join it with your original query:
select
curdate() - interval x.n day day,
count(distinct b.hash) visitors,
count(b.hash) page_views
from (
select 1 n union all select 2 union all select 3 union all select 4
union all select 5 union all select 6 union all select 7
) x
left join behaviour b
on b.company_id = 1
and b.timestamp >= curdate() - interval x.n day
and b.timestamp < curdate() - interval (x.n - 1) day
group by x.n
Use a query that returns all the dates from today minus 7 days to today and left join the table behaviour:
SELECT t.timestamp AS day,
Count(DISTINCT b.hash) AS visitors,
Count(b.timestamp) AS pageviews
FROM (
SELECT Subdate(Curdate(), 7) timestamp UNION ALL SELECT Subdate(Curdate(), 6) UNION ALL
SELECT Subdate(Curdate(), 5) UNION ALL SELECT Subdate(Curdate(), 4) UNION ALL SELECT Subdate(Curdate(), 3) UNION ALL
SELECT Subdate(Curdate(), 2) UNION ALL SELECT Subdate(Curdate(), 1) UNION ALL SELECT Curdate()
) t LEFT JOIN behaviour b
ON Date(b.timestamp) = t.timestamp AND b.company_id = 1
GROUP BY day
Use IFNULL:
IFNULL(expr1, 0)
From the documentation:
If expr1 is not NULL, IFNULL() returns expr1; otherwise it returns expr2. IFNULL() returns a numeric or string value, depending on the context in which it is used.
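As a quick illustration of the quoted behaviour, here is a minimal sketch with literal values (the column aliases are made up for the example). Note that IFNULL only replaces a NULL in a row that already exists; it does not by itself produce rows for days that have no entries.

```sql
-- IFNULL returns its first argument unless that argument is NULL.
SELECT IFNULL(NULL, 0) AS was_null,      -- 0
       IFNULL(5, 0)    AS was_not_null;  -- 5
```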
You can use the following trick:
First, write a query that returns one dummy row: SELECT 1;
Next, use a LEFT JOIN without a real condition to attach the summary row(s). The join returns the values if data exists and NULLs otherwise.
Last, select only what we need from the joined queries and convert NULLs to zeros using the IFNULL function.
SELECT
IFNULL(b.day,0) AS DAY,
IFNULL(b.visitors,0) AS visitors,
IFNULL(b.pageviews,0) AS pageviews
FROM (
SELECT 1
) a
LEFT JOIN (
SELECT DATE(TIMESTAMP) AS DAY,
COUNT(DISTINCT HASH) AS visitors,
COUNT(*) AS pageviews
FROM behaviour
WHERE company_id = 1
AND TIMESTAMP >= SUBDATE(CURDATE(), 7)
GROUP BY DAY
) b ON 1 = 1;

How to convert SQL result into percentage

I am trying to find the percentage increase in the last 7 days, but I am a little stuck. The SQL query I have created so far returns the total of new accounts in the last 7 days. How can I change it to return the result as a percentage?
Here is the SQL query done so far.
Thanks
SELECT COUNT(DISTINCT account_type)
FROM account
WHERE date_created > NOW() - INTERVAL 7 DAY
You could create a temporary table with two columns, say 'old count' and 'new count'. Populate the table with the values you get from your SELECT queries.
Then, retrieve the values from the temp table to calculate the percentage difference and delete the temp table.
To run everything in one query, you may consider the following:
SELECT
/* Count for previous period. */
beforeCount,
/* Count for current period. */
afterCount,
/* Simple math, just calculating percentage. */
(beforeCount * 100) / afterCount AS percent
FROM (
SELECT
/* Select count for previous period. */
(
SELECT COUNT(DISTINCT account_type)
FROM account
WHERE date_created BETWEEN NOW() - INTERVAL 14 DAY AND NOW() - INTERVAL 7 DAY
) AS beforeCount,
/* Select count for current period. */
(
SELECT COUNT(DISTINCT account_type)
FROM account
WHERE date_created > NOW() - INTERVAL 7 DAY
) AS afterCount
) AS tmp
You can try the approach below: calculate the count for the last 7 days, then the count for everything before that, and then the percentage.
select max(last7days_count) as last7days_count,
max(before7days_count) as before7days_count,
((max(before7days_count)*1.00)/max(last7days_count))*100.00 as percentage from
(
SELECT COUNT(DISTINCT account_type) as last7days_count, 0 as before7days_count
FROM account
WHERE date_created > NOW() - INTERVAL 7 DAY
union all
SELECT 0 as last7days_count, COUNT(DISTINCT account_type) as before7days_count
FROM account
WHERE date_created < NOW() - INTERVAL 7 DAY
) as T
Conditional aggregation might work. Use a CASE to only count the new and another to only count the old accounts.
SELECT count(DISTINCT CASE
WHEN date_created > NOW() - INTERVAL 7 DAY THEN
account_type
END)
/
count(DISTINCT CASE
WHEN date_created <= NOW() - INTERVAL 7 DAY THEN
account_type
END)
* 100 increase
FROM account;
With a temporary table you can do it like this:
create temporary table IF NOT EXISTS storeCount (
oldCount INT(10) not null,
newCount INT(10) not null
);
insert into storeCount (oldCount, newCount)
values
((SELECT COUNT(DISTINCT acc1.account_type) FROM account acc1),
(SELECT COUNT(DISTINCT acc2.account_type) FROM account acc2
WHERE acc2.date_created > NOW() - INTERVAL 7 DAY));
select ((newCount/oldCount)*100) as percentage from storeCount;
drop temporary table IF EXISTS storeCount;
Assuming you have one row per account, you don't need distinct. I am guessing you want:
SELECT (SUM(date_created >= CURDATE() - INTERVAL 7 DAY) * 100/
SUM(date_created < CURDATE() - INTERVAL 7 DAY)
) as percent_increase
FROM account

match timestamp with date in MYSQL using PHP

I have a table
id  user      Visitor  timestamp
13  username  abc      2014-01-16 15:01:44
I have to count the total visitors for a user over the last seven days, grouped by date (not by timestamp):
SELECT count(*) from tableA WHERE user=username GROUPBY __How to do it__ LIMIT for last seven day from today.
If no visitor came on a given day, there is no row for that day, so the result should show 0 for it.
What would be the correct query?
If you only need the total number of visits for the week (for any user), there is no need to GROUP BY. Try this:
SELECT
COUNT(*)
FROM
`table`
WHERE
`timestamp` >= (NOW() - INTERVAL 7 DAY);
If you need to track visits for a specified user, then try this:
SELECT
DATE(`timestamp`) as `date`,
COUNT(*) as `count`
FROM
`table`
WHERE
(`timestamp` >= (NOW() - INTERVAL 7 DAY))
AND
(`user` = 'username')
GROUP BY
`date`;
MySQL DATE() function reference.
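For illustration, this is what DATE() does to the sample timestamp from the question: it drops the time part, which is why grouping by it yields one row per calendar day. The alias `visit_date` is made up for the example.

```sql
-- DATE() extracts the date part of a datetime expression.
SELECT DATE('2014-01-16 15:01:44') AS visit_date;  -- 2014-01-16
```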
Try this:
SELECT DATE(a.timestamp), COUNT(*)
FROM tableA a
WHERE a.user='username' AND DATEDIFF(NOW(), DATE(a.timestamp)) <= 7
GROUP BY DATE(a.timestamp);
I think this works :)
SELECT COUNT(*)
FROM tableA
WHERE user = 'username' AND DATEDIFF(NOW(), timestamp) <= 7
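None of the queries above returns a row for a day without visits, which the question asks for. One possible sketch is to build the seven dates in a derived table and LEFT JOIN the visits onto it. It assumes the `tableA` columns shown in the question; the names `d`, `visit_date` and `visitors` are illustrative.

```sql
SELECT d.visit_date,
       COUNT(a.id) AS visitors            -- counts only matched rows, so empty days give 0
FROM (
    SELECT CURDATE() AS visit_date
    UNION ALL SELECT CURDATE() - INTERVAL 1 DAY
    UNION ALL SELECT CURDATE() - INTERVAL 2 DAY
    UNION ALL SELECT CURDATE() - INTERVAL 3 DAY
    UNION ALL SELECT CURDATE() - INTERVAL 4 DAY
    UNION ALL SELECT CURDATE() - INTERVAL 5 DAY
    UNION ALL SELECT CURDATE() - INTERVAL 6 DAY
) d
LEFT JOIN tableA a
       ON a.`timestamp` >= d.visit_date
      AND a.`timestamp` <  d.visit_date + INTERVAL 1 DAY
      AND a.`user` = 'username'           -- the user filter must stay in the ON clause
GROUP BY d.visit_date
ORDER BY d.visit_date;
```

Keeping the user filter in the ON clause rather than in WHERE matters here: a WHERE condition on `a` would discard the unmatched rows and the zero days with them.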

MySQL COUNT for days

I want to show the number of users visiting my page over the last 10 days in a chart. I need to COUNT() the visits for each of the last ten days.
The best layout would be
Day|COUNT(ip)
1 - 10
2 - 12
3 - 52
......
I hope you understand what I mean.
Can MySQL do this directly, or do I need to do this in PHP with 10 separate queries?
Regards,
Moritz
Update with table structure:
Id (Auto Increment)|Time (Unix Timestamp)|Ip|Referer
This should run fast for you
SELECT COUNT(ip) ipcount,dt FROM
(
SELECT ip,DATE(FROM_UNIXTIME(`Time`)) as dt FROM mytable
WHERE `Time` > UNIX_TIMESTAMP(NOW() - INTERVAL 10 DAY)
) A GROUP BY dt;
Make sure you have an index on Time
ALTER TABLE mytable ADD INDEX TimeIndex (`Time`);
This will give you results with actual date values:
SELECT
COUNT(DISTINCT ip),
FROM_UNIXTIME(Time, '%m/%d/%Y') AS Day
FROM
tbl
WHERE
Time >= UNIX_TIMESTAMP(DATE_ADD(CURDATE(), INTERVAL -10 DAY))
GROUP BY
FROM_UNIXTIME(Time, '%m/%d/%Y')
try this:
SELECT CAST(DATE(FROM_UNIXTIME(`Time`)) AS CHAR) as dateoftime, COUNT(Ip) as cnt
FROM tablename
WHERE DATE(FROM_UNIXTIME(`Time`)) > DATE_SUB(current_timestamp, INTERVAL 10 DAY)
GROUP BY CAST(DATE(FROM_UNIXTIME(`Time`)) AS CHAR)