SQL COUNT query similar to UNION but with results in multiple columns - mysql

I have a single-table SQL database built from DHCPD logs, structured as below:
+------+-------+------+----------+---------+-------------------+-----------------+
| id | Month | Day | Time | Type | MAC | ClientIP |
+------+-------+------+----------+---------+-------------------+-----------------+
| 9305 | Nov | 24 | 03:20:00 | DHCPACK | 00:04:f2:4b:dd:51 | 10.123.246.116 |
| 9307 | Nov | 24 | 03:20:07 | DHCPACK | 00:04:f2:99:4c:ba | 10.123.154.176 |
| 9310 | Nov | 24 | 03:20:08 | DHCPACK | 00:19:bb:cf:cd:28 | 10.99.107.3 |
| 9311 | Nov | 24 | 03:20:08 | DHCPACK | 00:19:bb:cf:cd:28 | 10.99.107.3 |
Every DHCP event from the log will eventually make its way into this database, so events from any point in time will be potentially used in the construction of graphs. To make use of the data for graphing, I need to be able to create an output table with multiple columns, but with values derived from a count of those in a single column matching a specific pattern.
The closest thing I've managed to come up with is this query:
select 'Data' as ClientIP, count(*) from Log where ClientIP like '10.99%' and MAC like '00:04:f2%'
union
select 'Voice' as ClientIP, count(*) from Log where ClientIP like '10.123%' and MAC like '00:04:f2%';
Which yields the following result:
+-----------+-------+
| ClientIP | Count |
+-----------+-------+
| Data | 4618 |
| Voice | 13876 |
+-----------+-------+
Fine for a one-off query, but I want to take those two rows, turn them into two columns, and run the same query with one row per hour (for instance). I want something like this:
+------+-------+------+
| Hour | Voice | Data |
+------+-------+------+
| 03 | 22 | 4 |
| 04 | 123 | 23 |
| 05 | 45 | 5 |
Any advice is greatly welcomed.
Thanks

You can group by hour and use conditional aggregation to count Data and Voice traffic in separate columns.
For example:
SELECT
HOUR(time) AS `Hour`,
SUM(CASE WHEN ClientIP like '10.99%' and MAC like '00:04:f2%' THEN 1 ELSE 0 END) AS `Data`,
SUM(CASE WHEN ClientIP like '10.123%' and MAC like '00:04:f2%' THEN 1 ELSE 0 END) AS `Voice`
FROM Log
GROUP BY HOUR(time)
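To try the conditional-aggregation idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. The first three rows come from the question and the two 04:xx rows are invented for illustration. SQLite has no HOUR() function, so the hour is taken with substr(Time, 1, 2) as a stand-in; the SUM(CASE ...) columns are the same as in the MySQL query above.

```python
import sqlite3

def hourly_counts(rows):
    """Return [(hour, voice, data), ...] for (Time, MAC, ClientIP) rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Log (Time TEXT, MAC TEXT, ClientIP TEXT)")
    con.executemany("INSERT INTO Log VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT substr(Time, 1, 2) AS Hour,  -- stand-in for MySQL's HOUR(Time)
               SUM(CASE WHEN ClientIP LIKE '10.123%' AND MAC LIKE '00:04:f2%'
                        THEN 1 ELSE 0 END) AS Voice,
               SUM(CASE WHEN ClientIP LIKE '10.99%' AND MAC LIKE '00:04:f2%'
                        THEN 1 ELSE 0 END) AS Data
        FROM Log
        GROUP BY substr(Time, 1, 2)
        ORDER BY substr(Time, 1, 2)
        """
    ).fetchall()
    con.close()
    return result

sample = [
    ("03:20:00", "00:04:f2:4b:dd:51", "10.123.246.116"),  # from the question
    ("03:20:07", "00:04:f2:99:4c:ba", "10.123.154.176"),  # from the question
    ("03:20:08", "00:19:bb:cf:cd:28", "10.99.107.3"),     # MAC prefix does not match
    ("04:01:00", "00:04:f2:aa:bb:cc", "10.99.1.2"),       # invented row
    ("04:02:00", "00:04:f2:aa:bb:cd", "10.123.1.2"),      # invented row
]

for row in hourly_counts(sample):
    print(row)
```

Each input row lands in exactly one hour group, and each CASE expression contributes 1 only when its own pattern matches, which is what turns the two UNION rows into two columns.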

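Alternatively, create a separate summary table with the layout you want:
+------+-------+------+
| Hour | Voice | Data |
+------+-------+------+
and keep it up to date, for example with an AFTER INSERT trigger on Log that increments the matching counter, or with a scheduled event that refreshes it every hour (triggers fire on row changes, not on a timer).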

Find most recent records with specific criteria in Access

I have a set of records and would like to display the most recent ones that match certain criteria. My earlier attempts first pulled the most recent records and only then applied the criteria, which caused some of the records to disappear. What I want is for the query to find the records that match the criteria first and then pull the most recent record for each case from that set. The query also needs to INSERT INTO a table in Access.
I thought I had it sorted out, but I get the error "Your query does not include the specified expression 'SufGrpID' as part of an aggregate function".
When the query runs on the sample data below, I would like the following to happen:
SufGrpID 03 would be removed from the set because it is not the newest record for CaseID 123
SufGrpID 04 would be removed from the set because it is not of SufTypeID 14 and it is not of Status F
How the data looks
+----------+---------+-------------------------+-----------+--------+
| SufGrpID | CaseID | CreateDate | SufTypeID | Status |
+----------+---------+-------------------------+-----------+--------+
| 01 | 123 | 2010-08-20 07:42:32.000 | 14 | F |
| 02 | 234 | 2010-04-28 10:33:56.000 | 14 | F |
| 03 | 123 | 2010-04-20 10:05:04.000 | 14 | F |
| 04 | 345 | 2010-08-20 11:18:42.000 | 12 | I |
| 05 | 345 | 2010-04-20 11:18:42.000 | 14 | F |
+----------+---------+-------------------------+-----------+--------+
Here's the code that did not work for me...
INSERT INTO [aStudent Base Data] ( [Self Suff ID], [Self Suff Create Date] )
SELECT dbo_sufscrgrp.SufGrpID, Max(dbo_sufscrgrp.CreateDate)
FROM dbo_sufscrgrp
WHERE (((dbo_sufscrgrp.SufTypeID)=14) AND ((dbo_sufscrgrp.Status)="F"))
GROUP BY dbo_sufscrgrp.CaseID;
What I'd like the results to be. (EDITED at 1:33 CST)
+--------------+------------------------+
| Self Suff ID | Self Suff Create Date |
+--------------+------------------------+
| 01 | 2010-08-20 07:42:32.000 |
| 02 | 2010-04-28 10:33:56.000 |
| 05 | 2010-04-20 11:18:42.000 |
+--------------+-------------------------+
Thanks for any help you can give!
Based on the minimal dataset example, consider:
SELECT dbo_sufscrgrp.*
FROM dbo_sufscrgrp
WHERE SufGrpID
IN (SELECT TOP 1 SufGrpID FROM dbo_sufscrgrp As Dupe
WHERE Dupe.CaseID=dbo_sufscrgrp.CaseID AND SufTypeID=14 and Status="F"
ORDER BY Dupe.CreateDate DESC, Dupe.SufGrpID DESC);
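The "filter first, then take the newest per CaseID" logic can be checked against the sample rows with a small sketch. It uses Python's sqlite3 module rather than Access, so LIMIT 1 stands in for TOP 1 and the table name sufscrgrp is a simplified, hypothetical version of dbo_sufscrgrp.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("01", 123, "2010-08-20 07:42:32.000", 14, "F"),
    ("02", 234, "2010-04-28 10:33:56.000", 14, "F"),
    ("03", 123, "2010-04-20 10:05:04.000", 14, "F"),
    ("04", 345, "2010-08-20 11:18:42.000", 12, "I"),
    ("05", 345, "2010-04-20 11:18:42.000", 14, "F"),
]

def newest_matching(rows):
    """Newest row per CaseID among rows with SufTypeID 14 and Status F."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE sufscrgrp (SufGrpID TEXT, CaseID INTEGER, "
        "CreateDate TEXT, SufTypeID INTEGER, Status TEXT)"
    )
    con.executemany("INSERT INTO sufscrgrp VALUES (?, ?, ?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT s.SufGrpID, s.CreateDate
        FROM sufscrgrp AS s
        WHERE s.SufGrpID = (
            SELECT d.SufGrpID
            FROM sufscrgrp AS d
            WHERE d.CaseID = s.CaseID          -- same case as the outer row
              AND d.SufTypeID = 14 AND d.Status = 'F'  -- criteria applied first
            ORDER BY d.CreateDate DESC, d.SufGrpID DESC
            LIMIT 1                            -- SQLite stand-in for TOP 1
        )
        ORDER BY s.SufGrpID
        """
    ).fetchall()
    con.close()
    return result

for row in newest_matching(ROWS):
    print(row)
```

Because the criteria sit inside the subquery, row 04 never becomes the "newest" candidate for CaseID 345, so row 05 is kept instead of the whole case disappearing.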

How can I select all rows which have been inserted in the last day?

I have a table like this:
// reset_password_emails
+----+----------+--------------------+-------------+
| id | id_user | token | unix_time |
+----+----------+--------------------+-------------+
| 1 | 2353 | 0c274nhdc62b9dc... | 1339412843 |
| 2 | 2353 | 0934jkf34098joi... | 1339412864 |
| 3 | 5462 | 3408ujf34o9gfvr... | 1339412894 |
| 4 | 3422 | 2309jrgv0435gff... | 1339412899 |
| 5 | 3422 | 34oihfc3lpot4gv... | 1339412906 |
| 6 | 2353 | 3498hfjp34gv4r3... | 1339412906 |
| 16 | 2353 | asdf3rf3409kv39... | 1466272801 |
| 7 | 7785 | 123dcoj34f43kie... | 1339412951 |
| 9 | 5462 | 3fcewloui493e4r... | 1339413621 |
| 13 | 8007 | 56gvb45cf3454g3... | 1339424860 |
| 14 | 7785 | vg4er5y2f4f45v4... | 1339424822 |
+----+----------+--------------------+-------------+
Each row is an email. Now I'm trying to limit how many reset-password emails can be sent: a user may receive at most 3 emails per day.
So I need a query that checks the user's history for the number of emails:
SELECT count(1) FROM reset_password_emails WHERE token = :token AND {from now until last day}
How can I implement this part:
. . . {from now until last day}
Actually I can do it like this: NOW() <= (unix_time + 86400). But I guess there is a better approach using INTERVAL. Can anybody tell me what that is?
Your expression will work once NOW() is replaced by UNIX_TIMESTAMP() (NOW() returns a datetime, not epoch seconds), but it has 3 problems:
the way you've coded it, the addition must be performed for every row (a performance hit)
because you're not comparing the raw column value, the query can't use an index on the unix_time column (if one exists)
it isn't easy to read
Try this:
unix_time > unix_timestamp(subdate(now(), interval '1' day))
Here the threshold is calculated once per query and compared against the raw column, so all of the problems above are addressed.
See SQLFiddle demo
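As a rough illustration of the "compute the threshold once" idea, the sketch below calculates the cutoff in application code and passes it as a parameter, so the query compares the raw unix_time column against a constant. It uses Python's sqlite3 module with a few rows from the question; filtering on id_user rather than token is an assumption about what the limit is meant to count, and the function and variable names are hypothetical.

```python
import sqlite3
import time

def make_demo_db():
    """In-memory table holding a subset of the question's rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE reset_password_emails (id_user INTEGER, unix_time INTEGER)"
    )
    con.executemany(
        "INSERT INTO reset_password_emails VALUES (?, ?)",
        [
            (2353, 1339412843),
            (2353, 1339412864),
            (5462, 1339412894),
            (2353, 1339412906),
            (2353, 1466272801),
        ],
    )
    return con

def emails_in_last_day(con, id_user, now=None):
    """Count emails sent to id_user during the 24 hours before `now`."""
    now = int(time.time()) if now is None else now
    cutoff = now - 86400  # computed once, outside the query
    (count,) = con.execute(
        "SELECT COUNT(*) FROM reset_password_emails "
        "WHERE id_user = ? AND unix_time > ?",  # raw column, so an index can be used
        (id_user, cutoff),
    ).fetchone()
    return count

con = make_demo_db()
# One minute after the newest row: only that row falls inside the window.
print(emails_in_last_day(con, 2353, now=1466272860))
```

The caller would then refuse to send another email when the returned count has reached 3.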
You can convert your unix_time using the from_unixtime function and compare it with the time one day ago:
select r.*
from reset_password_emails r
where from_unixtime(r.unix_time) >= now() - interval 1 day
Just add the extra filters you want.
See it here: http://sqlfiddle.com/#!9/4a7a9/3
It returns no rows because the unix_time values in your sample data are all old (from 2012, plus one from 2016).
Edited with a sqlfiddle that shows the conversion:
http://sqlfiddle.com/#!9/4a7a9/4

Different value counts on same column using LIKE

I have a database like below
+------------+---------------------------------------+--------+
| sender | subject | day |
+------------+---------------------------------------+--------+
| Darshana | Re: [Dev] [Platform] Build error | Monday |
| Dushan A | (MOLDOVADEVDEV-49) GREG Startup Error | Monday |
+------------+---------------------------------------+--------+
I want to get the result below from the above table. For each day, the query should check whether the subject contains a given word and, if so, add one to that word's column.
+---------+-------+-----------+
| Day     | "Dev" | "startup" |
+---------+-------+-----------+
| Monday  | 1     | 2         |
| Friday  | 0     | 3         |
+---------+-------+-----------+
I thought of using the DECODE function, but I couldn't get the expected result.
You can do this with conditional aggregation:
select day, sum(subject like '%Dev%') as Dev,
sum(subject like '%startup%') as startup
from your_table t -- your_table is a placeholder for the real table name
group by day;
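The same query can be tried in a few lines with Python's sqlite3 module, where a LIKE comparison also evaluates to 0 or 1 and can be summed directly. The table name mail and the Friday row are invented for illustration. Note that LIKE is case-insensitive for ASCII text in SQLite, and typically also under MySQL's default collations, so '%Dev%' matches "MOLDOVADEVDEV" as well.

```python
import sqlite3

def word_counts_by_day(rows):
    """Return [(day, dev_count, startup_count), ...] for (sender, subject, day) rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE mail (sender TEXT, subject TEXT, day TEXT)")
    con.executemany("INSERT INTO mail VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT day,
               SUM(subject LIKE '%Dev%')     AS Dev,      -- each match adds 1
               SUM(subject LIKE '%startup%') AS startup
        FROM mail
        GROUP BY day
        ORDER BY day
        """
    ).fetchall()
    con.close()
    return result

rows = [
    ("Darshana", "Re: [Dev] [Platform] Build error", "Monday"),
    ("Dushan A", "(MOLDOVADEVDEV-49) GREG Startup Error", "Monday"),
    ("Example", "startup script fails", "Friday"),  # invented row
]

for row in word_counts_by_day(rows):
    print(row)
```

If case-sensitive matching is needed, the comparison has to be made explicit in the target database (for example with a binary or case-sensitive collation in MySQL).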

mysql split a string in a where clause

I have an event system and for my repeat events I am using a cron like system.
Repeat Event:
+----+----------+--------------+
| id | event_id | repeat_value |
+----+----------+--------------+
| 1 | 11 | *_*_* |
| 2 | 12 | *_*_2 |
| 3 | 13 | *_*_4/2 |
| 4 | 14 | 23_*_* |
| 5 | 15 | 30_05_* |
+----+----------+--------------+
NOTE: The cron value is day_month_day of week
Event:
+----+------------------------+---------------------+---------------------+
| id | name | start_date_time | end_date_time |
+----+------------------------+---------------------+---------------------+
| 11 | Repeat daily | 2014-04-30 12:00:00 | 2014-04-30 12:15:00 |
| 12 | Repeat weekly | 2014-05-06 12:00:00 | 2014-05-06 13:00:00 |
| 13 | Repeat every two weeks | 2014-05-08 12:45:00 | 2014-05-08 13:45:00 |
| 14 | Repeat monthly | 2014-05-23 15:15:00 | 2014-05-23 16:00:00 |
| 15 | Repeat yearly | 2014-05-30 07:30:00 | 2014-05-30 10:15:00 |
+----+------------------------+---------------------+---------------------+
Anyway I have a query to select the events:
SELECT *
FROM RepeatEvent
JOIN `Event`
ON `Event`.`id` = `RepeatEvent`.`event_id`
That produces:
+----+----------+--------------+----+------------------------+---------------------+---------------------+
| id | event_id | repeat_value | id | name | start_date_time | end_date_time |
+----+----------+--------------+----+------------------------+---------------------+---------------------+
| 1 | 11 | *_*_* | 11 | Repeat daily | 2014-04-30 12:00:00 | 2014-04-30 12:15:00 |
| 2 | 12 | *_*_2 | 12 | Repeat weekly | 2014-05-06 12:00:00 | 2014-05-06 13:00:00 |
| 3 | 13 | *_*_4/2 | 13 | Repeat every two weeks | 2014-05-08 12:45:00 | 2014-05-08 13:45:00 |
| 4 | 14 | 23_*_* | 14 | Repeat monthly | 2014-05-23 15:15:00 | 2014-05-23 16:00:00 |
| 5 | 15 | 30_05_* | 15 | Repeat yearly | 2014-05-30 07:30:00 | 2014-05-30 10:15:00 |
+----+----------+--------------+----+------------------------+---------------------+---------------------+
However, I want to select the events that occur within a given month. There will only be certain repeat types: daily, weekly, every two weeks, monthly and yearly.
I want to put in my where clause a way to divide the string of the repeat value and if it fits any of the following conditions to show it as a result (repeatEvent is row that is being interrogated, search is the date being looked for):
parts = string_divide(repeat_value, '_')
daily = parts(0)
monthly = parts(1)
dayOfWeek = parts(2)
if (daily == '*' && monthly == '*' && dayOfWeek == '*') //returns all the daily events, as they always happen
    return repeatEvent
if (daily == '*' && monthly == '*' && dayOfWeek == search.dayOfWeek) //returns all the weekly events on a specific day of the week
    return repeatEvent
if (daily == search.date && monthly == '*' && dayOfWeek == '*') //returns all the monthly events on a specific day of the month
    return repeatEvent
if (contains(dayOfWeek, '/')) //events that repeat every two weeks
    weekParts = string_divide(dayOfWeek, '/')
    specificDayOfWeek = weekParts(0)
    if (specificDayOfWeek == repeatEvent.start_date.dayNumber)
        if ((timestampOf(search.timestamp) - timestampOf(repeatEvent.start_date)) / 604800 is 0 or even)
            return repeatEvent
if (daily == search.date && monthly == search.month && dayOfWeek == '*') //returns a single yearly event (shouldn't often crop up)
    return repeatEvent
//everything else is either an unknown format of repeat_value or not an event on this day
To summarise I want to run a query in which the repeat value is split in the where clause and I can interrogate the split items. I have looked at cursors but the internet seems to advise against them.
I could process the results of selecting all the repeat events in PHP, however, I imagine this being very slow.
Here is what I would like to see if looking at the month of April:
+----------+--------------+----+------------------------+---------------------+---------------------+
| event_id | repeat_value | id | name | start_date_time | end_date_time |
+----------+--------------+----+------------------------+---------------------+---------------------+
| 11 | *_*_* | 11 | Repeat daily | 2014-04-30 12:00:00 | 2014-04-30 12:15:00 |
+----------+--------------+----+------------------------+---------------------+---------------------+
Here is what I would like to see if looking at the month of May
+----------+--------------+----+------------------------+---------------------+---------------------+
| event_id | repeat_value | id | name | start_date_time | end_date_time |
+----------+--------------+----+------------------------+---------------------+---------------------+
| 11 | *_*_* | 11 | Repeat daily | 2014-04-30 12:00:00 | 2014-04-30 12:15:00 |
| 12 | *_*_2 | 12 | Repeat weekly | 2014-05-06 12:00:00 | 2014-05-06 13:00:00 |
| 13 | *_*_4/2 | 13 | Repeat every two weeks | 2014-05-08 12:45:00 | 2014-05-08 13:45:00 |
| 14 | 23_*_* | 14 | Repeat monthly | 2014-05-23 15:15:00 | 2014-05-23 16:00:00 |
| 15 | 30_05_* | 15 | Repeat yearly | 2014-05-30 07:30:00 | 2014-05-30 10:15:00 |
+----------+--------------+----+------------------------+---------------------+---------------------+
Here is what I would like to see if looking at the month of June
+----------+--------------+----+------------------------+---------------------+---------------------+
| event_id | repeat_value | id | name | start_date_time | end_date_time |
+----------+--------------+----+------------------------+---------------------+---------------------+
| 11 | *_*_* | 11 | Repeat daily | 2014-04-30 12:00:00 | 2014-04-30 12:15:00 |
| 12 | *_*_2 | 12 | Repeat weekly | 2014-05-06 12:00:00 | 2014-05-06 13:00:00 |
| 13 | *_*_4/2 | 13 | Repeat every two weeks | 2014-05-08 12:45:00 | 2014-05-08 13:45:00 |
| 14 | 23_*_* | 14 | Repeat monthly | 2014-05-23 15:15:00 | 2014-05-23 16:00:00 |
+----------+--------------+----+------------------------+---------------------+---------------------+
You could put a bandaid on this, but no one would be doing you any favors to tell you that that is the answer.
If your MySQL database can be changed I would strongly advise you to split your current column with underscores day_month_day of year to three separate columns, day, month, and day_of_year. I would also advise you to change your format to be INT rather than VARCHAR. This will make it faster and MUCH easier to search and parse, because it is designed in a way that doesn't need to be translated into computer language through complicated programs... It is most of the way there already.
Here's why:
Reason 1: Your Table is not Optimized
Your table is not optimized and will be slowed regardless of what you choose to do at this stage. SQL is not built to have multiple values in one column. The entire point of an SQL database is to split values into different columns and rows.
The advantage to normalizing this table is that it will be far quicker to search it, and you will be able to build queries in MySQL. Take a look at Normalization. It is a complicated concept, but once you get it you will avoid creating messy and complicated programs.
Reason 2: Your Table could be tweaked slightly to harness the power of computer date/time functions.
Computers track time as Unix epoch time: a count of the seconds elapsed since 00:00:00 UTC on 1 January 1970. Every system and programming language has fast, built-in date and time functions based on it, and MySQL is no different.
I would also recommend storing all of these as integers. repeat_doy (day of year) can easily be a SMALLINT or at least a standard INT, and instead of storing a month and day you can store the actual 1-365 day of the year. You can use DAYOFYEAR(NOW()) to get this value in MySQL. To turn it back into a date you can use MAKEDATE(YEAR(NOW()), repeat_doy). Instead of an asterisk to signify "all", you can use either 0 or NULL.
With a cron-like system you probably will not need to do that sort of calculation anyway.
Instead, it will probably be easier to compute the day of year elsewhere (every computer and language can do this; in Unix it is just date +%j).
Solution
Split your one repeat_value into three separate values and turn them all into integers based on UNIX time values. Day is 1-7 (or 0-6 for Sunday to Saturday), Month is 1-12, and day of year is 1-365 (remember, we are not including 366 because we are basing our year on an arbitrary non-leap year).
If you want to pull information in your SELECT query in your original format, it is much easier to use concat to merge the three columns than it is to try to search and split on one column. You can also easily harness built in MySQL functions to quickly turn what you pull into real, current, days, without a bunch of effort on your part.
To implement it in your SQL database:
+----+----------+--------------+--------------+------------+
| id | event_id | repeat_day | repeat_month | repeat_doy |
+----+----------+--------------+--------------+------------+
| 1 | 11 | * | * | * |
| 2 | 12 | * | * | 2 |
| 3 | 13 | * | * | 4/2 |
| 4 | 14 | 23 | * | * |
| 5 | 15 | 30 | 5 | * |
+----+----------+--------------+--------------+------------+
Now you should be able to build one query to get all of this data together regardless of how complicated your query. By normalizing your table, you will be able to fully harness the power of relational databases, without the headaches and hacks.
Edit
Hugo Delsing made a great point in the comments below. In my initial example I provided a fix to leap years for day_of_year in which I chose to ignore Feb 29. A much better solution removes the need for a fix. Split day_of_year to month and day with a compound index. He also has a suggestion about weeks and number of weeks, but I will just recommend you read it for more details.
Try writing the WHERE condition with these expressions:
substring_index(repeat_value,'_', 1)
instead of daily
substring_index(substring_index(repeat_value,'_', -2), '_', 1)
instead of monthly
and
substring_index(substring_index(repeat_value,'_', -1), '_', 1)
instead of dayOfWeek
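To see what those three expressions extract, here is a small Python sketch that imitates MySQL's SUBSTRING_INDEX for a non-empty delimiter and applies it to values from the question. The helper names are hypothetical, and the last field is taken with a single call, which gives the same result as the nested form.

```python
def substring_index(text, delim, count):
    """Rough Python equivalent of MySQL's SUBSTRING_INDEX(text, delim, count)."""
    parts = text.split(delim)
    if count > 0:
        return delim.join(parts[:count])   # everything before the count-th delimiter
    if count < 0:
        return delim.join(parts[count:])   # everything after the count-th delimiter from the right
    return ""

def split_repeat_value(repeat_value):
    """Split 'day_month_dayOfWeek' into its three fields."""
    daily = substring_index(repeat_value, "_", 1)
    monthly = substring_index(substring_index(repeat_value, "_", -2), "_", 1)
    day_of_week = substring_index(repeat_value, "_", -1)
    return daily, monthly, day_of_week

for value in ["*_*_*", "*_*_4/2", "23_*_*", "30_05_*"]:
    print(value, "->", split_repeat_value(value))
```

In the actual query, each of these expressions would be compared against '*' or the search date's parts inside the WHERE clause, at the cost of not being able to use an index on repeat_value.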
I think you are overthinking the problem if you only want the events per month and not per day. Assuming that you always correctly fill the repeat_value, the query is very basic.
Basically, an event occurs in a given month when its repeat_value is either LIKE '%_*_%' or LIKE '%_{month}_%'.
Since you mention PHP, I'm assuming you are building the query in PHP, so I used it as well.
<?php
function buildQuery($searchDate) {
//you could/should do some more checking if the date is valid if the user provides the string
$searchDate = empty($searchDate) ? date("Y-m-d") : $searchDate;
$splitDate = explode('-', $searchDate);
$month = str_pad($splitDate[1], 2, '0', STR_PAD_LEFT); //zero-pad so that 5 matches a stored 05, as in the sample data
//Select everything that started before the search date
//the \_ is because else the _ would match any char.
$query = 'SELECT *
FROM RepeatEvent
JOIN `Event`
ON `Event`.`id` = `RepeatEvent`.`event_id`
WHERE `Event`.`start_date_time` < \''.$searchDate.'\'
AND
(
`RepeatEvent`.`repeat_value` LIKE \'%\_'.$month.'\_%\'
OR `RepeatEvent`.`repeat_value` LIKE \'%\_*\_%\'
)
';
return $query;
}
//show the queries for every month of the current year, using today's day of the month
for ($month = 1; $month<=12; $month++) {
echo buildQuery(date('Y-'.$month.'-d')) . '<hr>';
}
?>
Now if the repeat_value could be wrong, you could add a simple regex check to make sure the value is always like *_*_* or *_*_*/*
You can use basic regular expressions in MySQL:
http://dev.mysql.com/doc/refman/5.0/en/pattern-matching.html
For a monthly event in May (first day) you can use a pattern like this (not tested):
[0-9\*]+\_[5\*]\_1
You can generate this pattern via PHP

MySQL using GROUP BY to group by multiple columns

I'd like to use GROUP BY multiple columns, I think it's best to start with an example:
SELECT
eventsviews.eventId,
showsActive.showId,
showsActive.venueId,
COUNT(*) AS count
FROM eventsviews
INNER JOIN events ON events.eventId = eventsviews.eventId
INNER JOIN showsActive ON showsActive.eventId = eventsviews.eventId
WHERE events.status = 1
GROUP BY showsActive.venueId, showsActive.showId, showsActive.eventId
ORDER BY count DESC
LIMIT 100;
Output:
| *eventId* | *showId* | *venueId* | *count* |
+-----------+----------+-----------+---------+
[...snip...]
| 95 | 92099 | 9770 | 32 |
| 95 | 105472 | 10702 | 32 |
| 3804 | 41225 | 8165 | 17 |
| 3804 | 41226 | 8165 | 17 |
| 923 | 2866 | 5451 | 14 |
| 923 | 20184 | 5930 | 14 |
[...snip...]
What I would like instead:
| *eventId* | *showId* | *venueId* | *count* |
+-----------+----------+-----------+---------+
| 95 | 92099 | 9770 | 32 |
| 3804 | 41226 | 8165 | 17 |
| 923 | 20184 | 5930 | 14 |
So, I want my data grouped by eventId, but only once for each showId and venueId ...
I actually have a SQL query that does that, but it has 8 subqueries and is as slow as a T-Ford ... And since this is executed on every page load, speeding things up looks like a good idea!
There are a few questions like this, and I've tried many different things, but I've been at this query for an hour and I can't seem to get it to work as I want :-(
Thanks!
You probably want either a MIN or a MAX on showId, and then leave it out of the GROUP BY. I can't tell which, because your "preferred" result set contains both.
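One way to sketch the MAX variant is to start from the already grouped rows and keep, for each eventId, the row with the highest showId. The example below does this with Python's sqlite3 module on the six rows shown in the question; the table name grouped and the column name views (in place of count) are hypothetical, and the join on a MAX subquery is just one portable way to pick a row per group.

```python
import sqlite3

# The six rows shown in the question's current output.
ROWS = [
    (95, 92099, 9770, 32),
    (95, 105472, 10702, 32),
    (3804, 41225, 8165, 17),
    (3804, 41226, 8165, 17),
    (923, 2866, 5451, 14),
    (923, 20184, 5930, 14),
]

def one_row_per_event(rows):
    """Keep, for each eventId, the row with the highest showId."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE grouped (eventId INTEGER, showId INTEGER, "
        "venueId INTEGER, views INTEGER)"
    )
    con.executemany("INSERT INTO grouped VALUES (?, ?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT g.eventId, g.showId, g.venueId, g.views
        FROM grouped AS g
        JOIN (SELECT eventId, MAX(showId) AS showId
              FROM grouped
              GROUP BY eventId) AS m
          ON m.eventId = g.eventId AND m.showId = g.showId
        ORDER BY g.views DESC
        """
    ).fetchall()
    con.close()
    return result

for row in one_row_per_event(ROWS):
    print(row)
```

With MAX, event 95 resolves to showId 105472 rather than the 92099 shown in the desired output; swapping MAX for MIN gives 92099 but changes the other two events, which is the ambiguity mentioned above.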
If you want your data grouped by eventId, group just by eventId and you'll get exactly the result you're looking for.
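MySQL lets you select non-aggregated columns that are not in the GROUP BY (provided the ONLY_FULL_GROUP_BY SQL mode is disabled), in which case the value comes from an arbitrary row of each group, not necessarily the first one. PostgreSQL achieves a similar result with DISTINCT ON, which is not available in MySQL.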