I'm trying to create a SQL query that retrieves records based on the start and end dates. However, the system doesn't have a very nice Database schema, so I'm looking for a way to create a query that works with it.
Basically, there's a meta table that stores the events' dates using the "date_NUMBER_(start|end)" format:
event_id
meta_key
meta_value
1
date_0_start
2022-04-02
1
date_0_end
2022-04-03
-
-
-
2
date_0_start
2022-03-21
2
date_0_end
2022-03-22
2
date_1_start
2022-06-24
2
date_1_end
2022-06-30
So, Event 1 has one date span: 2022-04-02 to 2022-04-03. While Event 2 has two: 2022-03-21 to 2022-03-22 and 2022-06-24 to 2022-06-30.
After 2022-03-22, Event 2 is not active, starting again only on 2022-06-24.
To create a filter that works with this schema, I came up with the following SQL query. For example, to search for events happening between the 2022-04-01 - 2022-04-03 range (any event that there's an event day between the range):
SELECT events.id
FROM events INNER JOIN events_meta ON ( events.id = events_meta.event_id ) INNER JOIN events_meta AS evt1 ON (
events.id = evt1.event_id )
WHERE
(
(
events_meta.meta_key LIKE 'date_%_start' AND
CAST(events_meta.meta_value AS DATE) >= '2022-04-01'
)
AND
(
evt1.meta_key LIKE 'date_%_end' AND
CAST(evt1.meta_value AS DATE) <= '2022-04-03'
)
)
The problem is that in this way, I should get only Event 1, but Event 2 is also returned since date_1_start is >= '2022-04-01' and date_0_end is <= '2022-04-03'
I'm not being able to find a way where I can match the NUMBER in the "date_NUMBER_(start|end)" meta_key format, so the query doesn't compare different NUMBERs.
Any help is appreciated :)
I made a fiddle with the INNER JOIN query:
http://sqlfiddle.com/#!9/fb497e/2
Use a self-join to get different keys for the same event ID. See MYSQL Select from tables based on multiple rows
SELECT m1.event_id, m1.meta_value AS start, m2.meta_value AS end
FROM events_meta AS m1
JOIN events_meta AS m2
ON m1.event_id = m2.event_id
AND SUBSTRING_INDEX(m1.meta_key, '_', 2) = SUBSTRING_INDEX(m2.meta_key, '_', 2)
WHERE m1.meta_key LIKE 'date_%_start' AND m2.meta_key LIKE 'date_%_end'
AND CAST(m1.meta_value AS DATE) >= '2022-04-01'
AND CAST(m2.meta_value AS DATE) <= '2022-04-03'
The SUBSTRING_INDEX() calls will return the prefixes date_0, date_1, etc. Including this in the ON condition will pair up the corresponding start and end keys.
Related
I have created a query that will count the number of posts and group them by the date, however the result doesn't show the dates when there were no posts.
QUERY:
SELECT DATE(Date_Uploaded) AS ForDate
, COUNT(*) AS NumPosts
FROM Articles
WHERE 'Status'='4'
GROUP
BY DATE(Date_Uploaded)
ORDER
BY ForDate
for info 'Status' = 4 means the post is published on the site and 'Date_Uploaded' is the timestamp of publication.
This will for example return;
2020-09-10: 2
2020-09-14: 1
2020-09-25: 4
However I want;
2020-09-10: 2
2020-09-11: 0
2020-09-12: 0
2020-09-13: 0
2020-09-14: 1
etc.
The reason I need my data like this is so I can use it with google charts to create a column chart that shows the number of posts over time. by not including the dates with no posts the chart will not format the missing dates.
If there is a way to still use the same query but have google charts space the data appropriately this would also be a great solution.
Thanks.
EDIT: The date range would be defined on the same page I intend to place the chart using the data
If you have data for all dates, but just not for status = 4, then you can use conditional aggregation
SELECT DATE(Date_Uploaded) AS ForDate,
SUM(CASE WHEN Status = 4 THEN 1 ELSE 0 END) AS NumPosts
FROM Articles
GROUP BY DATE(Date_Uploaded)
ORDER BY ForDate;
Otherwise, you need to use a table that has dates (or numbers) or generate them and use LEFT JOIN. However, the exact syntax depends on the database.
EDIT:
In MySQL, you can use a recursive CTE to generate the dates:
with recursive dates as (
select date('2020-09-10') as date
union all
select date + interval 1 day
from dates
where date < '2020-09-25'
)
select d.date, count(a.date_uploaded)
from dates d left join
articles a
on a.date_uploaded >= d.date and
a.date_uploaded < d.date + interval 1 day and
a.status = 4
group by d.date;
I'm developing a booking system and in my booking form I have a dropdown element which is returning (still) available start time slots for a booking system.
By creating a new booking the query I have created is working fine and all available start time slots are returned correctly.
QUERY :
WHERE {thistable}.id
IN (
SELECT id +3
FROM (
SELECT p1.book_date, t.*, count(p1.book_date) AS nbre
FROM fab_booking_taken AS p1
CROSS JOIN fab_booking_slots AS t
WHERE NOT ((t.heuredepart_resa < p1.book_end AND
t.heurearrivee_resa > p1.book_start))
AND DATE(p1.book_date)=DATE('{fab_booking___book_bookingdate}')
GROUP BY t.id) AS x
WHERE nbre =
(
SELECT count(p2.book_date)
FROM fab_booking_taken AS p2
WHERE p2.book_date = x.book_date
)
) ORDER BY id ASC
Please see video : booking creationg
The problem I have by using the same query by editing an existing booking the available start time slots are returned which is fine :
18:00
18:30
19:00
19:30
but not the already by the customer chosen (and in the database saved) time slot which is in my example 14:00.
Please see video : Editing booking with same query
Dropdown should be populated with the following options :
14:00
18:00
18:30
19:00
19:30
I tried to create an union query to get the already by the customer chosen start time slot and the (still) available start time slots.
QUERY :
{thistable}.id
IN (
SELECT id + 3
FROM (
SELECT p1.book_date, t.*, count(p1.book_date) AS nbre
FROM fab_booking_taken AS p1
CROSS JOIN fab_booking_slots AS t
WHERE NOT ((t.heuredepart_resa < p1.book_end
AND t.heurearrivee_resa > p1.book_start))
AND p1.book_date = DATE_FORMAT('{fab_booking___book_bookingdate}', '%Y-%m-%d')
GROUP BY t.id
) as foobar2
UNION (
SELECT id + 3
FROM (
SELECT p1.book_date, t.*, count(p1.book_date) AS nbre
FROM fab_booking_taken AS p1
CROSS JOIN fab_booking_slots AS t
WHERE ( ( t.heuredepart_resa < p1.book_end
AND t.heurearrivee_resa > p1.book_start ) )
AND t.id = '{fab_booking___book_starttime}'
AND p1.book_date = DATE_FORMAT('{fab_booking___book_bookingdate}', '%Y-%m-%d')
GROUP BY t.id
) AS x
WHERE nbre = (
SELECT count(p2.book_date)
FROM fab_booking_taken AS p2
WHERE p2.book_date = x.book_date
)
)
)
The already by the customer chosen start time slot is returned (14:00) but the other available returned start time slots are not correct.
Please see video : Editing booking with union query
I'm stuck and I have no clue how I could solve this issue, so I would appreciate some help here.
Thanks
Relevant database tables
fab_booking with the booking concerned into the video
please download the sql table
fab_booking_taken with the already existing bookings on 25 11 2016 id = 347
Please download the sql table
id 347 is the concerned booking
fab_booking_slots table which contains all possible time slots
Please download the sql table
fab_heuredepart_resa table which populate the dropdown element
Please download the sql table
I have to admit that it is daunting to try to untangle that query to understand the logic behind it but I think that the following query should return the results that you need.
{thistable}.id IN (
/*Finds booking slots where the booking slot does not overlap
with any of the existing bookings on that day,
or where the booking slot id is the same as the current slot.*/
SELECT t.id + 3
FROM fab_booking_slots AS t
WHERE t.id = '{fab_booking___book_starttime}'
OR NOT EXISTS (
Select 1
From fab_booking_taken AS p1
Where Date(p1.book_date) = Date('{fab_booking___book_bookingdate}')
And p1.book_end > t.heuredepart_resa
And p1.book_start < t.heurearrivee_resa
)
)
Order By id Asc;
I'm pretty sure that this is logically equivalent, and once expressed in a simplified form like this it's easier to see how you can get it to also return the additional time slot.
You should have a separate query to use when populating the time slots for a new booking that doesn't have an existing time slot, in which case you can just remove the single line t.id = '{fab_booking___book_starttime}' OR.
Suppose I have a table with 3 columns:
id (PK, int)
timestamp (datetime)
title (text)
I have the following records:
1, 2010-01-01 15:00:00, Some Title
2, 2010-01-01 15:00:02, Some Title
3, 2010-01-02 15:00:00, Some Title
I need to do a GROUP BY records that are within 3 seconds of each other. For this table, rows 1 and 2 would be grouped together.
There is a similar question here: Mysql DateTime group by 15 mins
I also found this: http://www.artfulsoftware.com/infotree/queries.php#106
I don't know how to convert these methods into something that will work for seconds. The trouble with the method on the SO question is that it seems to me that it would only work for records falling within a bin of time that starts at a known point. For instance, if I were to get FLOOR() to work with seconds, at an interval of 5 seconds, a time of 15:00:04 would be grouped with 15:00:01, but not grouped with 15:00:06.
Does this make sense? Please let me know if further clarification is needed.
EDIT: For the set of numbers, {1, 2, 3, 4, 5, 6, 7, 50, 51, 60}, it seems it might be best to group them {1, 2, 3, 4, 5, 6, 7}, {50, 51}, {60}, so that each grouping row depends on if the row is within 3 seconds of the previous. I know this changes things a bit, I'm sorry for being wishywashy on this.
I am trying to fuzzy-match logs from different servers. Server #1 may log an item, "Item #1", and Server #2 will log that same item, "Item #1", within a few seconds of server #1. I need to do some aggregate functions on both log lines. Unfortunately, I only have title to go on, due to the nature of the server software.
I'm using Tom H.'s excellent idea but doing it a little differently here:
Instead of finding all the rows that are the beginnings of chains, we can find all times that are the beginnings of chains, then go back and ifnd the rows that match the times.
Query #1 here should tell you which times are the beginnings of chains by finding which times do not have any times below them but within 3 seconds:
SELECT DISTINCT Timestamp
FROM Table a
LEFT JOIN Table b
ON (b.Timestamp >= a.TimeStamp - INTERVAL 3 SECONDS
AND b.Timestamp < a.Timestamp)
WHERE b.Timestamp IS NULL
And then for each row, we can find the largest chain-starting timestamp that is less than our timestamp with Query #2:
SELECT Table.id, MAX(StartOfChains.TimeStamp) AS ChainStartTime
FROM Table
JOIN ([query #1]) StartofChains
ON Table.Timestamp >= StartOfChains.TimeStamp
GROUP BY Table.id
Once we have that, we can GROUP BY it as you wanted.
SELECT COUNT(*) --or whatever
FROM Table
JOIN ([query #2]) GroupingQuery
ON Table.id = GroupingQuery.id
GROUP BY GroupingQuery.ChainStartTime
I'm not entirely sure this is distinct enough from Tom H's answer to be posted separately, but it sounded like you were having trouble with implementation, and I was thinking about it, so I thought I'd post again. Good luck!
Now that I think that I understand your problem, based on your comment response to OMG Ponies, I think that I have a set-based solution. The idea is to first find the start of any chains based on the title. The start of a chain is going to be defined as any row where there is no match within three seconds prior to that row:
SELECT
MT1.my_id,
MT1.title,
MT1.my_time
FROM
My_Table MT1
LEFT OUTER JOIN My_Table MT2 ON
MT2.title = MT1.title AND
(
MT2.my_time < MT1.my_time OR
(MT2.my_time = MT1.my_time AND MT2.my_id < MT1.my_id)
) AND
MT2.my_time >= MT1.my_time - INTERVAL 3 SECONDS
WHERE
MT2.my_id IS NULL
Now we can assume that any non-chain starters belong to the chain starter that appeared before them. Since MySQL doesn't support CTEs, you might want to throw the above results into a temporary table, as that would save you the multiple joins to the same subquery below.
SELECT
SQ1.my_id,
COUNT(*) -- You didn't say what you were trying to calculate, just that you needed to group them
FROM
(
SELECT
MT1.my_id,
MT1.title,
MT1.my_time
FROM
My_Table MT1
LEFT OUTER JOIN My_Table MT2 ON
MT2.title = MT1.title AND
(
MT2.my_time < MT1.my_time OR
(MT2.my_time = MT1.my_time AND MT2.my_id < MT1.my_id)
) AND
MT2.my_time >= MT1.my_time - INTERVAL 3 SECONDS
WHERE
MT2.my_id IS NULL
) SQ1
INNER JOIN My_Table MT3 ON
MT3.title = SQ1.title AND
MT3.my_time >= SQ1.my_time
LEFT OUTER JOIN
(
SELECT
MT1.my_id,
MT1.title,
MT1.my_time
FROM
My_Table MT1
LEFT OUTER JOIN My_Table MT2 ON
MT2.title = MT1.title AND
(
MT2.my_time < MT1.my_time OR
(MT2.my_time = MT1.my_time AND MT2.my_id < MT1.my_id)
) AND
MT2.my_time >= MT1.my_time - INTERVAL 3 SECONDS
WHERE
MT2.my_id IS NULL
) SQ2 ON
SQ2.title = SQ1.title AND
SQ2.my_time > SQ1.my_time AND
SQ2.my_time <= MT3.my_time
WHERE
SQ2.my_id IS NULL
This would look much simpler if you could use CTEs or if you used a temporary table. Using the temporary table might also help performance.
Also, there will be issues with this if you can have timestamps that match exactly. If that's the case then you will need to tweak the query slightly to use a combination of the id and the timestamp to distinguish rows with matching timestamp values.
EDIT: Changed the queries to handle exact matches by timestamp.
Warning: Long answer. This should work, and is fairly neat, except for one step in the middle where you have to be willing to run an INSERT statement over and over until it doesn't do anything since we can't do recursive CTE things in MySQL.
I'm going to use this data as the example instead of yours:
id Timestamp
1 1:00:00
2 1:00:03
3 1:00:06
4 1:00:10
Here is the first query to write:
SELECT a.id as aid, b.id as bid
FROM Table a
JOIN Table b
ON (a.Timestamp is within 3 seconds of b.Timestamp)
It returns:
aid bid
1 1
1 2
2 1
2 2
2 3
3 2
3 3
4 4
Let's create a nice table to hold those things that won't allow duplicates:
CREATE TABLE
Adjacency
( aid INT(11)
, bid INT(11)
, PRIMARY KEY (aid, bid) --important for later
)
Now the challenge is to find something like the transitive closure of that relation.
To do so, let's find the next level of links. by that I mean, since we have 1 2 and 2 3 in the Adjacency table, we should add 1 3:
INSERT IGNORE INTO Adjacency(aid,bid)
SELECT adj1.aid, adj2.bid
FROM Adjacency adj1
JOIN Adjacency adj2
ON (adj1.bid = adj2.aid)
This is the non-elegant part: You'll need to run the above INSERT statement over and over until it doesn't add any rows to the table. I don't know if there is a neat way to do that.
Once this is over, you will have a transitively-closed relation like this:
aid bid
1 1
1 2
1 3 --added
2 1
2 2
2 3
3 1 --added
3 2
3 3
4 4
And now for the punchline:
SELECT aid, GROUP_CONCAT( bid ) AS Neighbors
FROM Adjacency
GROUP BY aid
returns:
aid Neighbors
1 1,2,3
2 1,2,3
3 1,2,3
4 4
So
SELECT DISTINCT Neighbors
FROM (
SELECT aid, GROUP_CONCAT( bid ) AS Neighbors
FROM Adjacency
GROUP BY aid
) Groupings
returns
Neighbors
1,2,3
4
Whew!
I like #Chris Cunningham's answer, but here's another take on it.
First, my understanding of your problem statement (correct me if I'm wrong):
You want to look at your event log as a sequence, ordered by the time of the event,
and partitition it into groups, defining the boundary as being an interval of
more than 3 seconds between two adjacent rows in the sequence.
I work mostly in SQL Server, so I'm using SQL Server syntax. It shouldn't be too difficult to translate into MySQL SQL.
So, first our event log table:
--
-- our event log table
--
create table dbo.eventLog
(
id int not null ,
dtLogged datetime not null ,
title varchar(200) not null ,
primary key nonclustered ( id ) ,
unique clustered ( dtLogged , id ) ,
)
Given the above understanding of the problem statement, the following query should give you the upper and lower bounds your groups. It's a simple, nested select statement with 2 group by to collapse things:
The innermost select defines the upper bound of each group. That upper boundary defines a group.
The outer select defines the lower bound of each group.
Every row in the table should fall into one of the groups so defined, and any given group may well consist of a single date/time value.
[edited: the upper bound is the lowest date/time value where the interval is more than 3 seconds]
select dtFrom = min( t.dtFrom ) ,
dtThru = t.dtThru
from ( select dtFrom = t1.dtLogged ,
dtThru = min( t2.dtLogged )
from dbo.EventLog t1
left join dbo.EventLog t2 on t2.dtLogged >= t1.dtLogged
and datediff(second,t1.dtLogged,t2.dtLogged) > 3
group by t1.dtLogged
) t
group by t.dtThru
You could then pull rows from the event log and tag them with the group to which they belong thus:
select *
from ( select dtFrom = min( t.dtFrom ) ,
dtThru = t.dtThru
from ( select dtFrom = t1.dtLogged ,
dtThru = min( t2.dtLogged )
from dbo.EventLog t1
left join dbo.EventLog t2 on t2.dtLogged >= t1.dtLogged
and datediff(second,t1.dtLogged,t2.dtLogged) > 3
group by t1.dtLogged
) t
group by t.dtThru
) period
join dbo.EventLog t on t.dtLogged >= period.dtFrom
and t.dtLogged <= coalesce( period.dtThru , t.dtLogged )
order by period.dtFrom , period.dtThru , t.dtLogged
Each row is tagged with its group via the dtFrom and dtThru columns returned. You could get fancy and assign an integral row number to each group if you want.
Simple query:
SELECT * FROM time_history GROUP BY ROUND(UNIX_TIMESTAMP(time_stamp)/3);
Data:
values date
14 1.1.2010
20 1.1.2010
10 2.1.2010
7 4.1.2010
...
sample query about january 2010 should get 31 rows. One for every day. And values vould be added. Right now I could do this with 31 queries but I would like this to work with one. Is it possible?
results:
1. 34
2. 10
3. 0
4. 7
...
This is actually surprisingly difficult to do in SQL. One way to do it is to have a long select statement with UNION ALLs to generate the numbers from 1 to 31. This demonstrates the principle but I stopped at 4 for clarity:
SELECT MonthDate.Date, COALESCE(SUM(`values`), 0) AS Total
FROM (
SELECT 1 AS Date UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
--
SELECT 28 UNION ALL
SELECT 29 UNION ALL
SELECT 30 UNION ALL
SELECT 31) AS MonthDate
LEFT JOIN Table1 AS T1
ON MonthDate.Date = DAY(T1.Date)
AND MONTH(T1.Date) = 1 AND YEAR(T1.Date) = 2010
WHERE MonthDate.Date <= DAY(LAST_DAY('2010-01-01'))
GROUP BY MonthDate.Date
It might be better to use a table to store these values and join with it instead.
Result:
1, 34
2, 10
3, 0
4, 7
Given that for some dates you have no data, you'll need to fill in the gaps. One approach to this is to have a calendar table prefilled with all dates you need, and join against that.
If you want the results to show day numbers as you have showing in your question, you could prepopulate these in your calendar too as labels.
You would join your data table date field to the date field of the calendar table, group by that field, and sum values. You might want to specify limits for the range of dates covered.
So you might have:
CREATE TABLE Calendar (
label varchar,
cal_date date,
primary key ( cal_date )
)
Query:
SELECT
c.label,
SUM( d.values )
FROM
Calendar c
JOIN
Data_table d
ON d.date_field = c.cal_date
WHERE
c.cal_date BETWEEN '2010-01-01' AND '2010-01-31'
GROUP BY
d.date_field
ORDER BY
d.date_field
Update:
I see you have datetimes rather than dates. You could just use the MySQL DATE() function in the join, but that would probably not be optimal. Another approach would be to have start and end times in the Calendar table defining a 'time bucket' for each day.
This works for me... Its a modification of a query I found on another site. The "INTERVAL 1 MONTH" clause ensures I get the current month data, including zeros for days that have no hits. Change this to "INTERVAL 2 MONTH" to get last months data, etc.
I have a table called "payload" with a column "timestamp" - Im then joining the timestamp column on to the dynamically generated dates, casting it so that the dates match in the ON clause.
SELECT `calendarday`,COUNT(P.`timestamp`) AS `cnt` FROM
(SELECT #tmpdate := DATE_ADD(#tmpdate, INTERVAL 1 DAY) `calendarday`
FROM (SELECT #tmpdate :=
LAST_DAY(DATE_SUB(CURDATE(),INTERVAL 1 MONTH)))
AS `dynamic`, `payload`) AS `calendar`
LEFT JOIN `payload` P ON DATE(P.`timestamp`) = `calendarday`
GROUP BY `calendarday`
To dynamically get the dates within a date range using SQL you can do this (example in mysql):
Create a table to hold the numbers 0 through 9.
CREATE TABLE ints ( i tinyint(4) );
insert into ints (i)
values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
Run a query like so:
select ((curdate() - interval 2 year) + interval (t.i * 100 + u.i * 10 + v.i) day) AS Date
from
ints t
join ints u
join ints v
having Date between '2015-01-01' and '2015-05-01'
order by t.i, u.i, v.i
This will generate all dates between Jan 1, 2015 and May 1, 2015.
Output
2015-01-01
2015-01-02
2015-01-03
2015-01-04
2015-01-05
2015-01-06
...
2015-05-01
The query joins the table ints 3 times and gets an incrementing number (0 through 999). It then adds this number as a day interval starting from a certain date, in this case a date 2 years ago. Any date range from 2 years ago and 1,000 days ahead can be obtained with the example above.
To generate a query that generates dates for more than 1,000 days simply join the ints table once more to allow for up to 10,000 days of range, and so forth.
If I'm understanding the rather vague question correctly, you want to know the number of records for each date within a month. If that's true, here's how you can do it:
SELECT COUNT(value_column) FROM table WHERE date_column LIKE '2010-01-%' GROUP BY date_column
I need to get records with different date field ,
table Sites:
field id
reference
created
Every day we add lot of records, so I need to do a function that extract all records existing with duplicates of rows just was added, to do some notifications.
the conditions that i can't get is the difference between records of the current day and the old data in the table should be (one day to 4 days) .
If is there any simple query to do that without using transaction .
I'm not sure I totally understand what you mean by duplicate records, but here's a basic date query:
SELECT fieldId, reference, created, DATE(created) as the_date
FROM Sites
WHERE the_date
BETWEEN DATE( DATE_SUB( NOW() , INTERVAL 3 DAY ) )
AND DATE ( NOW() )
I'm making several assumptions such as:
You don't want the "first" row returned
Duplicates don't carry the
date forward (The next after initial 4 days is not a duplicate)
The 4 days means +4 days so Day 5 is included
So, my code is :
with originals as (
select s1.*
from sites as s1
where 0 = (
select count(*)
from sites as s2
where s1.field_id = s2.field_id
and s1.reference = s2.reference
and s1.created <> s2.created
and DATEDIFF(DAY,s2.created, s1.created) between 1 and 4
)
)
select s1.*
from sites as s1
inner join originals as o
on s1.field_id = o.field_id
and s1.reference = o.reference
and s1.created <> o.created
where DATEDIFF(DAY,o.created, s1.created) between 1 and 4
order by 1,2,3;
Here it is in a fiddle: http://sqlfiddle.com/#!3/9b407/20
This could be simpler if some conditions are relaxed.
thanks a lot for every one who tried to help me ,
i have found this solution after lot of test
SELECT `id`,`reference`,count(`config_id`) as c,`created` FROM `sites`
where datediff(date(current_date()),date(`created`)) < 4
group by `reference`
having c > 1
thanks a lot for your help