MySQL average count

I have a table named comments:
ID | COMMENT | DATE |
---|---------|----------|
1 | TEXT... | 01/01/12 |
2 | TEXT... | 01/01/12 |
3 | TEXT... | 15/01/12 |
4 | TEXT... | 01/01/13 |
The table contains comments from 2012 and a few from 2013. How can I select only the records from 2012 and then get the average comment count for 2012?

One really shouldn't store temporal values in string-type columns. I suggest that you first convert your DATE column to MySQL's DATE type:
ALTER TABLE comments ADD COLUMN new_date DATE AFTER `DATE`;
UPDATE comments SET new_date = STR_TO_DATE(`DATE`, '%d/%m/%y');
ALTER TABLE comments DROP COLUMN `DATE`;
(Obviously you will need to update your application code to use this new column).
Then:
How can I select only records from 2012
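You can simply use inequality operators in a filter. A half-open range is used here because BETWEEN is inclusive at both ends and would also match rows dated 2013-01-01:
SELECT *
FROM comments
WHERE new_date >= '2012-01-01' AND new_date < '2013-01-01'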
and then get average comment count of 2012?
Not entirely sure what you mean by "average comment count", but if you want the mean number of comments per day:
SELECT COUNT(*) / DATEDIFF('2013-01-01', '2012-01-01')
FROM comments
WHERE new_date >= '2012-01-01' AND new_date < '2013-01-01'
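As a rough cross-check of the arithmetic, here is a minimal Python sketch that applies the same steps to the four sample rows from the question: parse the dd/mm/yy strings, keep only dates in 2012 using a half-open range, and divide by the number of days in the year. The variable names are illustrative only and are not part of the original answer.

```python
from datetime import date, datetime

# Sample DATE strings from the question (dd/mm/yy).
raw_dates = ["01/01/12", "01/01/12", "15/01/12", "01/01/13"]

# Same conversion as STR_TO_DATE(`DATE`, '%d/%m/%y').
parsed = [datetime.strptime(s, "%d/%m/%y").date() for s in raw_dates]

# Half-open range: includes 2012-01-01, excludes 2013-01-01.
start, end = date(2012, 1, 1), date(2013, 1, 1)
in_2012 = [d for d in parsed if start <= d < end]

# Equivalent of DATEDIFF('2013-01-01', '2012-01-01'); 2012 is a leap year.
days_in_2012 = (end - start).days

mean_per_day = len(in_2012) / days_in_2012
print(len(in_2012), days_in_2012, mean_per_day)
```

The comment dated 01/01/13 is excluded only because the upper bound is exclusive; an inclusive upper bound of 2013-01-01 would count it as well.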

Related

With SQL I want to find a date BETWEEN 2 columns with LIKE

I'm working on a site that uses MySQL, and the reminders table has 3 columns called date, date_to, and yearly. The dates are stored as, for example, 2022-06-15.
If, for example, there is a yearly = 1 reminder with date and date_to being 2021-06-14 and 2021-06-16, how can I check BETWEEN date and date_to when I need to use LIKE, because for a yearly = 1 reminder I can only match on the -06-15 part?
I need to use -06-15 sometimes because a yearly reminder is meant to repeat every year, so I can't check the year as well.
Table example:
| date | date_to | yearly |
| -----------| -------------- | ------ |
| 2021-06-14 | 2021-06-16 | 1 |
| 2022-05-03 | 2022-05-04 | 0 |
Expected output after searching for a yearly reminder for date 06-15:
| date | date_to | yearly |
| -----------| -------------- | ------ |
| 2021-06-14 | 2021-06-16 | 1 |
If I understand your issue correctly, you just want to check whether your date lies between the from date and the to date in your table. This can basically be done with this query:
SELECT date_from, date_to, yearly
FROM yourtable
WHERE '2021-06-15' BETWEEN date_from AND date_to;
If you don't care about the year, but want to check the month and day only, you can use DATE_FORMAT like this:
SELECT date_from, date_to, yearly
FROM yourtable
WHERE DATE_FORMAT('2021-06-15', '%m-%d')
BETWEEN DATE_FORMAT(date_from, '%m-%d')
AND DATE_FORMAT(date_to, '%m-%d')
Then it doesn't matter which year appears (in this example, 2021); only the month and day will be checked.
It is also worth mentioning that MySQL provides a function DAYOFYEAR, which will work correctly in most cases for a query like this:
SELECT date_from, date_to, yearly
FROM yourtable
WHERE DAYOFYEAR('2021-06-15')
BETWEEN DAYOFYEAR(date_from) AND DAYOFYEAR(date_to);
This also ignores the year and checks the day only. It's a bit easier to read, but it's less safe: it can fail when a leap year is involved, because every day after February 29 then has a day-of-year number one higher than in other years.
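To make the leap-year caveat concrete, here is a small Python sketch. It first shows that the same calendar day gets a different day-of-year number in a leap year, then compares (month, day) pairs instead, which is the same idea as the DATE_FORMAT '%m-%d' comparison. The helper name in_yearly_range and the wrap-around branch for ranges that cross New Year are additions for illustration, not something from the answer above.

```python
from datetime import date

# 1 March has a different day-of-year number in leap and non-leap years.
leap_doy = date(2020, 3, 1).timetuple().tm_yday
plain_doy = date(2021, 3, 1).timetuple().tm_yday

def in_yearly_range(target, date_from, date_to):
    """Year-independent check on (month, day), like comparing '%m-%d' strings."""
    lo = (date_from.month, date_from.day)
    hi = (date_to.month, date_to.day)
    t = (target.month, target.day)
    if lo <= hi:
        return lo <= t <= hi
    # Assumed extension: the range crosses New Year (e.g. 30 Dec to 2 Jan).
    return t >= lo or t <= hi

# The yearly reminder from the question, checked against 15 June of another year.
hit = in_yearly_range(date(2022, 6, 15), date(2021, 6, 14), date(2021, 6, 16))
miss = in_yearly_range(date(2022, 6, 17), date(2021, 6, 14), date(2021, 6, 16))
print(leap_doy, plain_doy, hit, miss)
```

Comparing month and day directly avoids the off-by-one that day-of-year numbers introduce after the end of February.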
Please note that I changed the name of your column "date" to "date_from" in my answer. This is not a mistake: I recommend not using SQL keywords or function names as table or column names. Furthermore, the name "date_from" better points out the difference from the column "date_to", so you should rename the column if possible.

Get the SUM between two given dates

If I want to get the total consumption over a range of dates, how would I do that?
I thought I could do:
SELECT id, SUM(consumption)
FROM consumption_info
WHERE date_time BETWEEN 2013-09-15 AND 2013-09-16
GROUP BY id;
however this returns: Empty set, 2 warnings(0.00 sec)
---------------------------------------
id | consumption | date_time |
=======================================|
1 | 5 | 2013-09-15 21:35:03 |
2 | 5 | 2013-09-15 24:35:03 |
3 | 7 | 2013-09-16 11:25:23 |
4 | 3 | 2013-09-16 20:15:23 |
----------------------------------------
Any ideas what I'm doing wrong here?
Thanks in advance.
You're missing quotes around the date strings: the WHERE clause should actually be written as...
BETWEEN '2013-09-15' AND '2013-09-16'
The irony is that 2013-09-15 is a valid SQL expression: it means 2013 minus 09 minus 15, i.e. the number 1989. Obviously, no date lies between the corresponding results, hence the empty set.
Yet there might be another, more subtle error here: you probably should have used this clause...
BETWEEN '2013-09-15 00:00:00' AND '2013-09-16 23:59:59'
... instead. Without setting the time explicitly it'll be set to '00:00:00' on both dates (as DATETIME values are compared here).
While it's obviously ok for the starting date, it's not so for the ending one - unless, of course, exclusion of all the records for any time of that day but midnight is actually the desired outcome.
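Both pitfalls can be reproduced in a few lines of Python. This is only a sketch of the comparison logic, not of MySQL itself: it uses three of the sample rows (the row with the time 24:35:03 is left out because that is not a valid time of day) and sums consumption for an inclusive range, once with a midnight upper bound and once with an end-of-day upper bound.

```python
from datetime import datetime

# What the unquoted literal 2013-09-15 evaluates to as integer arithmetic.
unquoted_value = 2013 - 9 - 15

# (id, consumption, date_time) taken from the question's sample data.
rows = [
    (1, 5, datetime(2013, 9, 15, 21, 35, 3)),
    (3, 7, datetime(2013, 9, 16, 11, 25, 23)),
    (4, 3, datetime(2013, 9, 16, 20, 15, 23)),
]

def total_between(lo, hi):
    """Inclusive range sum, mirroring date_time BETWEEN lo AND hi."""
    return sum(consumption for _, consumption, ts in rows if lo <= ts <= hi)

# '2013-09-16' with no time part means midnight at the start of that day.
midnight_total = total_between(datetime(2013, 9, 15), datetime(2013, 9, 16))

# Explicit end-of-day upper bound, as in the corrected clause.
full_day_total = total_between(datetime(2013, 9, 15),
                               datetime(2013, 9, 16, 23, 59, 59))

print(unquoted_value, midnight_total, full_day_total)
```

With the midnight upper bound only the row from the 15th is counted; the two rows from the 16th appear once the upper bound covers the whole day.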
SELECT SUM(consumption)
FROM consumption_info
WHERE date_time >= '2013-09-15' AND date_time < '2013-09-17';
or
SELECT SUM(consumption)
FROM consumption_info
WHERE date_time BETWEEN '2013-09-15 00:00:00' AND '2013-09-16 23:59:59';
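It's better to use CAST when comparing against date values. Note that CAST('2013-09-16' AS DATETIME) is midnight at the start of that day, so extend the upper bound if the whole of the 16th should be included.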
SELECT id, SUM(consumption)
FROM consumption_info
WHERE date_time
BETWEEN CAST('2013-09-15' AS DATETIME)
AND CAST('2013-09-16' AS DATETIME)
GROUP BY id;

Get the number of rows created for each month over a year

I have a db table named APPLICATION that has the following columns
applicationnumber varchar,
createddate datetime,
applicantname varchar,
material varchar,
location varchar
I need to write a query that displays the number of applications created in each month for each location.
Eg. The query result should be something like below
Location | Jan2012 | Feb2012 | Mar2012 | Apr2012
-----------------------------------------------------------------
London | 34322342 | 4342424 | 54353454 | 5434
Chicago| 43242345 | 9943455 | 85748294 | 544
The result is the number of applications created in each month for the specific location.
Each column will execute the same query logic, with just the month changing.
I tried using the MONTH() function, but I need the month matrix as a column and not as a row.
You can use the DATE_FORMAT function on the createddate field to get only the month and year, and then GROUP BY location and that month. Something like this should do it:
SELECT
`location`,
COUNT(`applicationnumber`),
DATE_FORMAT(`createddate`, '%M %Y') AS `date`
FROM `APPLICATION`
GROUP BY `location`, `date`;
This gives the count of rows per location and month, with one row per month rather than one column per month.
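If the months really are needed as columns, one common technique is conditional aggregation: one SUM(CASE ...) expression per month. The sketch below is an assumption-laden illustration rather than the answerer's method. It runs against an in-memory SQLite database from Python purely so that it is self-contained, the sample rows are made up, and SQLite's strftime stands in for MySQL's DATE_FORMAT(createddate, '%Y-%m').

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE application ("
    "applicationnumber TEXT, createddate TEXT, location TEXT)"
)
# Hypothetical sample rows, invented for this sketch.
conn.executemany(
    "INSERT INTO application VALUES (?, ?, ?)",
    [
        ("A1", "2012-01-05 10:00:00", "London"),
        ("A2", "2012-01-20 11:30:00", "London"),
        ("A3", "2012-02-02 09:15:00", "London"),
        ("A4", "2012-01-07 14:00:00", "Chicago"),
        ("A5", "2012-03-11 16:45:00", "Chicago"),
    ],
)

# One output column per month; the list of months is fixed up front.
months = [("Jan2012", "2012-01"), ("Feb2012", "2012-02"), ("Mar2012", "2012-03")]
month_columns = ",\n".join(
    f"SUM(CASE WHEN strftime('%Y-%m', createddate) = '{ym}' "
    f"THEN 1 ELSE 0 END) AS {label}"
    for label, ym in months
)
query = (
    f"SELECT location,\n{month_columns}\n"
    "FROM application\nGROUP BY location\nORDER BY location"
)

pivot = conn.execute(query).fetchall()
for row in pivot:
    print(row)
```

The set of month columns has to be known when the query is built, which is why the column list is generated in application code here.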

MySQL order by date strange prob

I have been working on updating an existing website. It has an entry form that saves into a table; the table structure and sample data are as follows:
id | name | type | in_date | year
-----------------------------------------------------
1 | name1 | 1 | 2-July | 2011
2 | name2 | 2 | 2-June | 2011
3 | name44 | 2 | 8-Sep | 2011
Now I need to order this table by the whole date (i.e. as 2-June-2011) with a simple query such as
SELECT * FROM order_list order by date DESC
Is there any way to do this? I have tried a lot of queries. Is there any way to combine these 2 columns?
We can't alter the DB since it already contains many records.
You should store your dates as MySQL DATE types, rather than as strings:
ALTER TABLE order_list ADD COLUMN new_date DATE;
UPDATE order_list
SET new_date = STR_TO_DATE(CONCAT(in_date, '-', year), '%e-%b-%Y');
ALTER TABLE order_list DROP COLUMN in_date, DROP COLUMN year;
Ordering then becomes trivial (almost exactly as you attempted, but on the new column):
SELECT * FROM order_list ORDER BY new_date DESC;
If you're unable to alter the database schema, you can perform the STR_TO_DATE operation in the ORDER BY clause (but this is not very efficient):
SELECT *
FROM order_list
ORDER BY STR_TO_DATE(CONCAT(in_date, '-', year), '%e-%b-%Y') DESC
Don't do that. Put the entire date in one column and then, if you really have to, create computed columns that hold the year or the day/month.
You can write a simple script that merges those two existing columns into a single date column in your existing database.
You can try concatenating the columns with + (SQL Server) or CONCAT (MySQL). Note that this sorts the combined text, not the calendar date:
SELECT * FROM order_list ORDER BY CONCAT(in_date, '-', year) DESC
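For the "simple script" route, here is a hedged Python sketch of combining the two columns into a real date and sorting on it. The sample rows are the ones from the question; the function name parse_split_date is made up for illustration. Because the sample mixes full month names ("July", "June") with abbreviated ones ("Sep"), the sketch tries both formats, and it assumes English month names in the active locale.

```python
from datetime import datetime

def parse_split_date(in_date, year):
    """Combine values like '2-July' and 2011 into a date object."""
    text = f"{in_date}-{year}"
    # Try the full month name first, then the abbreviated one.
    for fmt in ("%d-%B-%Y", "%d-%b-%Y"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {text!r}")

# (id, name, in_date, year) from the question's sample data.
rows = [
    (1, "name1", "2-July", 2011),
    (2, "name2", "2-June", 2011),
    (3, "name44", "8-Sep", 2011),
]

newest_first = sorted(
    rows, key=lambda r: parse_split_date(r[2], r[3]), reverse=True
)
for row in newest_first:
    print(row)
```

Sorting on the parsed date orders the rows chronologically, which sorting on the concatenated text does not.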

Query database in weekly interval

I have a database with a created_at column containing the datetime in Y-m-d H:i:s format.
The latest datetime entry is 2011-09-28 00:10:02.
I need the query to be relative to the latest datetime entry.
The first value in the query should be the latest datetime entry.
The second value in the query should be the entry closest to 7 days from the first value.
The third value should be the entry closest to 7 days from the second value.
REPEAT #3.
What I mean by "closest to 7 days from":
The following are dates, the interval I desire is a week, in seconds a week is 604800 seconds.
7 days from the first value is equal to 1316578202 (1317183002-604800)
the value closest to 1316578202 (7 days) is... 1316571974
unix timestamp | Y-m-d H:i:s
1317183002 | 2011-09-28 00:10:02 -> appear in query (first value)
1317101233 | 2011-09-27 01:27:13
1317009182 | 2011-09-25 23:53:02
1316916554 | 2011-09-24 22:09:14
1316836656 | 2011-09-23 23:57:36
1316745220 | 2011-09-22 22:33:40
1316659915 | 2011-09-21 22:51:55
1316571974 | 2011-09-20 22:26:14 -> closest to 7 days from 1317183002 (first value)
1316499187 | 2011-09-20 02:13:07
1316064243 | 2011-09-15 01:24:03
1315967707 | 2011-09-13 22:35:07 -> closest to 7 days from 1316571974 (second value)
1315881414 | 2011-09-12 22:36:54
1315794048 | 2011-09-11 22:20:48
1315715786 | 2011-09-11 00:36:26
1315622142 | 2011-09-09 22:35:42
I would really appreciate any help. I have not been able to do this in MySQL, and no online resources seem to deal with relative date manipulation such as this. I would like the query to be modular enough that the interval can be changed to weekly, monthly, or yearly. Thanks in advance!
Answer #1 Reply:
SELECT
UNIX_TIMESTAMP(created_at)
AS unix_timestamp,
(
SELECT MIN(UNIX_TIMESTAMP(created_at))
FROM my_table
WHERE created_at >=
(
SELECT max(created_at) - 7
FROM my_table
)
)
AS `random_1`,
(
SELECT MIN(UNIX_TIMESTAMP(created_at))
FROM my_table
WHERE created_at >=
(
SELECT MAX(created_at) - 14
FROM my_table
)
)
AS `random_2`
FROM my_table
WHERE created_at =
(
SELECT MAX(created_at)
FROM my_table
)
Returns:
unix_timestamp | random_1 | random_2
1317183002 | 1317183002 | 1317183002
Answer #2 Reply:
RESULT SET:
This is the result set for a yearly interval:
id | created_at | period_index | period_timestamp
267 | 2010-09-27 22:57:05 | 0 | 1317183002
1 | 2009-12-10 15:08:00 | 1 | 1285554786
I desire this result:
id | created_at | period_index | period_timestamp
626 | 2011-09-28 00:10:02 | 0 | 0
267 | 2010-09-27 22:57:05 | 1 | 1317183002
I hope this makes more sense.
It's not exactly what you asked for, but the following example is pretty close....
Example 1:
select
floor(timestampdiff(SECOND, tbl.time, most_recent.time)/604800) as period_index,
unix_timestamp(max(tbl.time)) as period_timestamp
from
tbl
, (select max(time) as time from tbl) most_recent
group by period_index
gives results:
+--------------+------------------+
| period_index | period_timestamp |
+--------------+------------------+
| 0 | 1317183002 |
| 1 | 1316571974 |
| 2 | 1315967707 |
+--------------+------------------+
This breaks the dataset into groups based on "periods", where (in this example) each period is 7-days (604800 seconds) long. The period_timestamp that is returned for each period is the 'latest' (most recent) timestamp that falls within that period.
The period boundaries are all computed based on the most recent timestamp in the database, rather than computing each period's start and end time individually based on the timestamp of the period before it. The difference is subtle - your question requests the latter (iterative approach), but I'm hoping that the former (approach I've described here) will suffice for your needs, since SQL doesn't lend itself well to implementing iterative algorithms.
If you really do need to determine each period based on the timestamp in the previous period, then your best bet is going to be an iterative approach -- either using a programming language of your choice (like php), or by building a stored procedure that uses a cursor.
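The difference between the two approaches can be sketched in Python using the timestamps listed in the question. The first function mirrors the fixed-period grouping of Example 1; the second is one possible iterative version, in which each step looks for the entry closest to one interval before the previously chosen entry. The function names and the stopping rule (stop once the target falls before the oldest entry) are assumptions made for this sketch.

```python
WEEK = 604800  # seconds in 7 days

# Unix timestamps from the question's sample data.
timestamps = [
    1317183002, 1317101233, 1317009182, 1316916554, 1316836656,
    1316745220, 1316659915, 1316571974, 1316499187, 1316064243,
    1315967707, 1315881414, 1315794048, 1315715786, 1315622142,
]

def latest_per_period(stamps, interval):
    """Fixed periods counted back from the newest entry (as in Example 1)."""
    newest = max(stamps)
    latest = {}
    for t in stamps:
        idx = (newest - t) // interval  # period_index
        latest[idx] = max(latest.get(idx, t), t)
    return [latest[i] for i in sorted(latest)]

def closest_by_interval(stamps, interval):
    """Iterative variant: each pick is relative to the previous pick."""
    ordered = sorted(stamps, reverse=True)
    picked = [ordered[0]]
    oldest = ordered[-1]
    while True:
        target = picked[-1] - interval
        if target < oldest:  # assumed stopping rule
            break
        earlier = [t for t in ordered if t < picked[-1]]
        picked.append(min(earlier, key=lambda t: abs(t - target)))
    return picked

grouped = latest_per_period(timestamps, WEEK)
iterative = closest_by_interval(timestamps, WEEK)
print(grouped)
print(iterative)
```

On this sample both approaches select the same three timestamps, but they are not equivalent in general: the iterative version can pick an entry slightly past the fixed period boundary when that entry is closer to the target.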
Edit #1
Here's the table structure for the above example.
CREATE TABLE `tbl` (
`id` int(10) unsigned NOT NULL auto_increment PRIMARY KEY,
`time` datetime NOT NULL
)
Edit #2
Ok, first: I've improved the original example query (see revised "Example 1" above). It still works the same way, and gives the same results, but it's cleaner, more efficient, and easier to understand.
Now... the query above is a group-by query, meaning it shows aggregate results for the "period" groups as I described above, not row-by-row results like a "normal" query. With a group-by query, you're limited to using aggregate columns only. Aggregate columns are those columns that are named in the group by clause, or that are computed by an aggregate function like MAX(time). It is not possible to extract meaningful values for non-aggregate columns (like id) from within the projection of a group-by query.
Unfortunately, unless the ONLY_FULL_GROUP_BY SQL mode is enabled (it is by default from MySQL 5.7.5 onward), MySQL doesn't generate an error when you try to do this. Instead, it picks an arbitrary value from within the grouped rows and shows that value for the non-aggregate column in the grouped result. This is what's causing the odd behavior the OP reported when trying to use the code from Example #1.
Fortunately, this problem is fairly easy to solve. Just wrap another query around the group query, to select the row-by-row information you're interested in...
Example 2:
SELECT
entries.id,
entries.time,
periods.idx as period_index,
unix_timestamp(periods.time) as period_timestamp
FROM
tbl entries
JOIN
(select
floor(timestampdiff( SECOND, tbl.time, most_recent.time)/31536000) as idx,
max(tbl.time) as time
from
tbl
, (select max(time) as time from tbl) most_recent
group by idx
) periods
ON entries.time = periods.time
Result:
+-----+---------------------+--------------+------------------+
| id | time | period_index | period_timestamp |
+-----+---------------------+--------------+------------------+
| 598 | 2011-09-28 04:10:02 | 0 | 1317183002 |
| 996 | 2010-09-27 22:57:05 | 1 | 1285628225 |
+-----+---------------------+--------------+------------------+
Notes:
Example 2 uses a period length of 31536000 seconds (365 days), while Example 1 (above) uses a period of 604800 seconds (7 days). Other than that, the inner query in Example 2 is the same as the primary query shown in Example 1.
If a matching period_time belongs to more than one entry (i.e. two or more entries have the exact same time, and that time matches one of the selected period_time values), then the above query (Example 2) will include multiple rows for the given period timestamp (one for each match). Whatever code consumes this result set should be prepared to handle such an edge case.
It's also worth noting that these queries will perform much, much better if you define an index on your datetime column. For my example schema, that would look like this:
ALTER TABLE tbl ADD INDEX idx_time ( time )
If you're willing to go for the closest that is after the week is out then this'll work. You can extend it to work out the closest, but it'll look so disgusting it's probably not worth it. Note that in MySQL the offset has to be written as INTERVAL 7 DAY; subtracting a bare 7 from a DATETIME performs numeric arithmetic on the value instead of going back 7 days.
select unix_tstamp
, ( select min(unix_tstamp)
from my_table
where sql_tstamp >= ( select max(sql_tstamp) - INTERVAL 7 DAY
from my_table )
)
, ( select min(unix_tstamp)
from my_table
where sql_tstamp >= ( select max(sql_tstamp) - INTERVAL 14 DAY
from my_table )
)
from my_table
where sql_tstamp = ( select max(sql_tstamp)
from my_table )