I have two log tables that I would like to link, but the entries in each table are not made at exactly the same time; the time difference varies, but it should always be within a second.
To keep it simple, let's say table A looks like:
ItemId int
Comment varchar(50)
LogTime datetime
and let's say that table B has the exact same structure.
Suppose these records are in Table A:
ItemId Comment LogTime
-----------------------------------
100 Test100-A1 12:00:00.00
200 Test200-A 12:00:03.50
100 Test100-A2 12:00:06.30
and these are in Table B
ItemId Comment LogTime
-----------------------------------
100 Test100-B1 12:00:00.03
200 Test200-B 12:00:02.98
100 Test100-B2 12:00:06.53
And I'd like to have the following output
A.ItemId A.Comment A.LogTime B.ItemId B.Comment B.LogTime
-------------------------------------------------------------------------
100 Test100-A1 12:00:00.00 100 Test100-B1 12:00:00.03
200 Test200-A 12:00:03.50 200 Test200-B 12:00:02.98
100 Test100-A2 12:00:06.30 100 Test100-B2 12:00:06.53
How can I create a query that will link the two tables together this way on the ItemId and LogTime, but with up to a 1 second variation in either direction for the LogTime?
I figured it out... was actually a bit simpler than I realized.
select *
from A left join
B on A.ItemId = B.ItemId and
abs(DATEDIFF(ss, A.LogTime, B.LogTime)) <= 1
I tried doing it based on milliseconds instead of seconds the first time, but that was giving me an overflow error when comparing dates that were too far apart. I'd rather use milliseconds, though, so I can narrow the window down to a little less than a second, but I'm not sure of the best way to accomplish that just yet. Maybe I could use a CASE statement. If someone else wants to post an answer that does it I'll mark it, or else I'll come back later and update my answer to work off milliseconds when I get a chance.
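One way to get sub-second precision without the DATEDIFF overflow (an untested sketch; the 900 ms window is just an example value) is to turn the difference check into a range check with DATEADD, so no difference ever needs to be computed:
select *
from A left join
B on A.ItemId = B.ItemId and
B.LogTime between DATEADD(ms, -900, A.LogTime) and DATEADD(ms, 900, A.LogTime)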
I have a table with a structure something like this:
Device | paid | time
abc 1 2 days ago
abc 0 1 day ago
abc 0 5 mins ago
Is it possible to write a query that checks the paid column on all the rows where Device = abc and then outputs the most recent two rows that differ? Basically, something like an if statement saying if row 1 = 1 and row 2 = 0, output that, but only if it's the most recent two rows that are different; for example, in this case, the first and second row. The table is being updated whenever a user changes from a free to a paid account etc. It is also updated in different columns for different reasons, hence the duplicate 0s for example.
I know this would probably be done better by having another table altogether and updating that every time the user switches account type, but is there any way to make this work?
Thanks
Example:
http://rextester.com/MABU7860 (needs further testing on edge cases, but this seems to work).
SELECT A.*, B.*
FROM SQLfoo A
INNER JOIN SQLfoo B
on A.Device = B.Device
and A.mTime < B.mTime
WHERE A.Paid <> B.Paid
and A.device = 'abc'
ORDER BY B.mTime Desc, A.MTime Desc
LIMIT 1
By performing a self join on the device, where the time from one side is less than the time from the other (so a record never matches itself and we only get each pair one way), and ordering by those times descending, the highest times appear first in the result. Since we limit to a single device we don't need to concern ourselves with the other devices. We then just need to compare the paid value from one side to the paid value from the other and return the first result encountered, thus LIMIT 1.
Or using user variables
http://rextester.com/TWVEVX7830
In other engines one might accomplish this task by performing the join as above, assigning a row number partitioned by the device, and then simply returning all the rows with a row number of 1, which would be the most recent discrepancy per device.
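A rough sketch of that window-function variant (untested; it needs an engine with ROW_NUMBER support such as MySQL 8+, SQL Server or Postgres, and the OldPaid/NewPaid/OldTime/NewTime aliases are made up), using the table from the rextester example:
SELECT *
FROM (
    SELECT A.Device,
           A.Paid AS OldPaid, A.mTime AS OldTime,
           B.Paid AS NewPaid, B.mTime AS NewTime,
           ROW_NUMBER() OVER (PARTITION BY A.Device
                              ORDER BY B.mTime DESC, A.mTime DESC) AS rn
    FROM SQLfoo A
    INNER JOIN SQLfoo B
            ON A.Device = B.Device
           AND A.mTime < B.mTime
    WHERE A.Paid <> B.Paid
) AS t
WHERE rn = 1;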
Use LIMIT to limit the number of records in MySQL:
http://www.mysqltutorial.org/mysql-limit.aspx
In your case, use LIMIT 2
and then put the 2 records that you just selected into an array, then compare the array values to see if they are different. If they are different, then print them.
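A minimal sketch of the query side of this approach (untested; the table name is borrowed from the answer above and the column names from the question); comparing the two returned rows would then happen in application code:
SELECT paid, `time`
FROM SQLfoo
WHERE Device = 'abc'
ORDER BY `time` DESC
LIMIT 2;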
I would really appreciate some help with a MySQL query for the following matter.
My table, let's call it "data", contains the following fields: "timestamp" and "temperature".
Every 30 seconds a new record is being added into it.
My goal is to identify the record (timestamp) which, compared to the one added 2 minutes later (4 records later), has a temperature difference of 20 degrees (or more).
Ex.
...
19:14:08 99
19:14:38 100
19:15:08 101
19:15:38 105
19:16:08 115
19:16:38 126
19:17:08 150
19:17:38 151
...
In this case, the timestamp which I have to find is 19:14:38, because if compared to the one at 19:16:38, we have 126-100 = 26 > 20.
There are some other conditions (not worth mentioning) which have to be met as well, but at least those I can handle myself.
Thanks for your help.
If your timestamps are exact, you can use a self-join:
select t.*
from t join
t tnext
on t.timestamp = tnext.timestamp - interval 2 minute
where tnext.temperature - t.temperature >= 20;
This is highly dependent on the accuracy of your timestamps, however.
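If the timestamps can drift by a few seconds, one variant (an untested sketch; the 115-125 second window is just an example tolerance) is to join on a small range instead of an exact offset:
select t.*
from t join
t tnext
on tnext.timestamp between t.timestamp + interval 115 second
and t.timestamp + interval 125 second
where tnext.temperature - t.temperature >= 20;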
So I believe you're on the right track with a join to the same table; something like this should get you started. It's untested air code, and I typically write Oracle SQL, so pardon any syntax nuances...
SELECT
a.TEMPERATURE AS NEW_TEMP
,b.TEMPERATURE AS PRIOR_TEMP
FROM
DATA a
INNER JOIN DATA b ON
b.TIMESTAMP = a.TIMESTAMP - INTERVAL '2' MINUTE
AND ABS(a.TEMPERATURE - b.TEMPERATURE) >= 20
Additionally, using a timestamp is probably not as reliable as you might think, since there could be variation that you do not want to exist: the timestamp may be off by a second (i.e. it is exactly 1:59 prior to the new record), in which case this join would miss it. Whereas if you were using an auto-incremented ID as suggested above, you could simply replace that first join clause with:
b.RECORD_ID = a.RECORD_ID-4
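For reference, a sketch of what that could look like in MySQL syntax against the question's table (untested; it assumes the data table has such an auto-increment id column, and the earlier_temp/later_temp aliases are made up):
SELECT a.timestamp, a.temperature AS earlier_temp, b.temperature AS later_temp
FROM data a
JOIN data b ON b.id = a.id + 4
WHERE b.temperature - a.temperature >= 20;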
I think it will be easiest to start with the table I have and the result I am aiming for.
Name | Date
A | 03/01/2012
A | 03/01/2012
B | 02/01/2012
A | 02/01/2012
B | 02/01/2012
A | 02/01/2012
B | 01/01/2012
B | 01/01/2012
A | 01/01/2012
I want the result of my query to be:
Name | 01/01/2012 | 02/01/2012 | 03/01/2012
A | 1 | 2 | 2
B | 2 | 2 | 0
So basically I want to count the number of rows that have the same date, but for each individual name. So a simple group by of dates won't do because it would merge the names together. And then I want to output a table that shows the counts for each individual date using php.
I've seen answers suggest something like this:
SELECT
NAME,
SUM(CASE WHEN GRADE = 1 THEN 1 ELSE 0 END) AS GRADE1,
SUM(CASE WHEN GRADE = 2 THEN 1 ELSE 0 END) AS GRADE2,
SUM(CASE WHEN GRADE = 3 THEN 1 ELSE 0 END) AS GRADE3
FROM Rodzaj
GROUP BY NAME
so I imagine there would be a way for me to tweak that but I was wondering if there is another way, or is that the most efficient?
I was perhaps thinking the while loop could output just one specific name and date each time along with the count, so the first result would be A,01/01/2012,1, then the next A,02/01/2012,2 - A,03/01/2012,2 - B,01/01/2012,2 etc. Perhaps that would be doable through a different technique, but I'm not sure if something like that is possible and whether it would be efficient.
So I'm basically looking to see if anyone has any ideas that are a bit outside the box for this and how they would compare.
I hope I explained everything well enough and thanks in advance for any help.
You have to include two columns in your GROUP BY:
SELECT name, COUNT(*) AS count
FROM your_table
GROUP BY name, date
This will get the counts of each name -> date combination in row-format. Since you also wanted to include a 0 count if the name didn't have any rows on a certain date, you can use:
SELECT a.name,
b.date,
COUNT(c.name) AS date_count
FROM (SELECT DISTINCT name FROM your_table) a
CROSS JOIN (SELECT DISTINCT date FROM your_table) b
LEFT JOIN your_table c ON a.name = c.name AND
b.date = c.date
GROUP BY a.name,
b.date
SQLFiddle Demo
You're asking for a "pivot". Basically, it is what it is. The real problem with a pivot is that the column names must adapt to the data, which is impossible to do with SQL alone.
Here's how you do it:
SELECT
Name,
SUM(`Date` = '01/01/2012') AS `01/01/2012`,
SUM(`Date` = '02/01/2012') AS `02/01/2012`,
SUM(`Date` = '03/01/2012') AS `03/01/2012`
FROM mytable
GROUP BY Name
Note the cool way you can SUM() a condition in MySQL: because in MySQL true is 1 and false is 0, summing a condition is equivalent to counting the number of times it's true.
It is not more efficient to use an inner group by first.
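If you don't want to hard-code the dates, one possible workaround (a rough, untested sketch; it assumes Date is stored as the string shown and that a two-step prepared statement is acceptable) is to build the column list with GROUP_CONCAT and run the result dynamically:
SET SESSION group_concat_max_len = 100000;

SELECT CONCAT(
         'SELECT Name, ',
         GROUP_CONCAT(DISTINCT
           CONCAT('SUM(`Date` = ''', `Date`, ''') AS `', `Date`, '`')
           ORDER BY `Date`),
         ' FROM mytable GROUP BY Name')
INTO @sql
FROM mytable;

PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;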
Just in case anyone is interested in what was the best method:
Zane's second suggestion was the slowest; I loaded in a third of the data I did for the other two and it still took quite a while. Perhaps on smaller tables it would be more efficient, but although I am not working with a huge table, roughly 28,000 rows was enough to create significant lag, with the BETWEEN clause dropping the result to about 4,000 rows.
Bohemian's answer gave me the least amount to code, I threw in a loop to create all the case statements and it worked with relative ease. The benefit of this method was the simplicity, besides creating the loop for the cases, the results come in without the need for any php tricks, just simple foreach to get all the columns. Recommended for those not confident with php.
However, I found Zane's first suggestion the quickest performing, and despite the need for extra PHP coding it seems I will be sticking with this method. The disadvantage of this method is that it only gives the dates that actually have data, so creating a table with all the dates becomes a bit more complicated. What I did was create a variable that keeps track of the date the current table column is supposed to show, reset on each table row; when the result of the query equals that date it echoes the value, otherwise it runs a while loop echoing table cells with 0 until the dates do match. It also has to check whether the 'Name' value is still the same, and if not, switch to the next row after filling in any missing cells with 0 to the end of that row. If anyone is interested in seeing the code you can message me.
Results of the two methods over 3 months of data (a column for each day, so roughly 90 case statements), ~12,000 rows out of 28,000:
Bohemian's Pivot - ~0.158s (highest seen ~0.36s)
Zane's Double Group by - ~0.086s (highest seen ~0.15s)
I would like to show a filtered result to a few IPs that keep scraping my content. I have blocked them with .htaccess, but they change their IP address and continue doing it. So I thought I would create a soft block that won't show them all of my content, and hopefully they won't even notice.
My table has an auto_increment field
id | category | everything else
1 1
2 1
3 4
4 2
I have been trying something like this.
SELECT * from mytable WHERE `category` = '1' having avg(id/3) = 1 ORDER BY `id` DESC LIMIT 0 , 10
I have searched forever but I am a newb to sql, so I don't even really know what I am searching for. I hope somebody here can please help me! Thanks :)
If you want to get the remainder of division by 3, you should use the % operator.
SELECT * FROM mytable
WHERE `category` = '1' AND id % 3 = 1
ORDER BY `id` DESC
LIMIT 0, 10
Generally, the ID column is not for doing computations on. It does not represent anything other than a unique identifier for the record (at most, it can be used to sort the records chronologically); you could have GUIDs there, for example, and your application should still work.
If you want to store the IPs that you want to block in your DB, consider adding another column to your table, call it status or something similar, and store in this column the status for that IP; this status could be clean, suspicious, blocked, etc. After that, your SELECT only needs to look at the rows with blocked status.
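For illustration only (the table and column names here are made up, not from the question), that status-column idea could look something like this:
CREATE TABLE ip_status (
    ip     VARCHAR(45) NOT NULL PRIMARY KEY,  -- long enough for IPv6
    status ENUM('clean', 'suspicious', 'blocked') NOT NULL DEFAULT 'clean'
);

-- Is the requesting IP one of the blocked ones?
SELECT 1
FROM ip_status
WHERE ip = '203.0.113.7'
  AND status = 'blocked';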
I have a MySQL database where one column contains status codes. The column is of type int and the values will only ever be 100,200,300,400. It looks like below; other columns removed for clarity.
id | status
----------------
1 300
2 100
3 100
4 200
5 300
6 300
7 100
8 400
9 200
10 300
11 100
12 400
13 400
14 400
15 300
16 300
The id field is auto-generated and will always be sequential. I want to have a third column displaying a comma-separated string of the frequency distribution of the status codes of the previous 10 rows. It should look like this.
id | status | freq
-----------------------------------
1 300
2 100
3 100
4 200
5 200
6 300
7 100
8 400
9 300
10 300
11 100 300,100,200,400 -- from rows 1-10
12 400 100,300,200,400 -- from rows 2-11
13 400 100,300,200,400 -- from rows 3-12
14 400 300,400,100,200 -- from rows 4-13
15 300 400,300,100,200 -- from rows 5-14
16 300 300,400,100 -- from rows 6-15
I want the most frequent code listed first. And where two status codes have the same frequency it doesn't matter to me which is listed first but I did list the smaller code before the larger in the example. Lastly, where a code doesn't appear at all in the previous ten rows, it shouldn't be listed in the freq column either.
And to be very clear the row number that the frequency string appears on does NOT take into account the status code of that row; it's only the previous rows.
So what have I done? I'm pretty green with SQL. I'm a programmer and I find this SQL language a tad odd to get used to. I managed the following self-join select statement.
select *, avg(b.status) freq
from sample a
join sample b
on (b.id < a.id) and (b.id > a.id - 11)
where a.id > 10
group by a.id;
Using the aggregate function avg, I can at least demonstrate the concept. The derived table b provides the correct rows to the avg function, but I just can't figure out the multi-step process of counting and grouping rows from b to get a frequency distribution and then collapsing those frequency rows into a single string value.
Also I've tried using standard stored functions and procedures in place of the built-in aggregate functions, but it seems the b derived table is out of scope or something. I can't seem to access it. And from what I understand writing a custom aggregate function is not possible for me as it seems to require developing in C, something I'm not trained for.
Here's sql to load up the sample.
create table sample (
id int NOT NULL AUTO_INCREMENT,
PRIMARY KEY(id),
status int
);
insert into sample(status) values(300),(100),(100),(200),(200),(300)
,(100),(400),(300),(300),(100),(400),(400),(400),(300),(300),(300)
,(100),(400),(100),(100),(200),(500),(300),(100),(400),(200),(100)
,(500),(300);
The sample has 30 rows of data to work with. I know it's a long question, but I just wanted to be as detailed as I could be. I've worked on this for a few days now and would really like to get it done.
Thanks for your help.
The only way I know of to do what you're asking is to use a BEFORE INSERT trigger. It has to be BEFORE INSERT because you want to update a value in the row being inserted, which can only be done in a BEFORE trigger. Unfortunately, that also means it won't have been assigned an ID yet, so hopefully it's safe to assume that at the time a new record is inserted, the last 10 records in the table are the ones you're interested in. Your trigger will need to get the values of the last 10 ID's and use the GROUP_CONCAT function to join them into a single string, ordered by the COUNT. I've been using SQL Server mostly and I don't have access to a MySQL server at the moment to test this, but hopefully my syntax will be close enough to at least get you moving in the right direction:
create trigger sample_trigger BEFORE INSERT ON sample
FOR EACH ROW
BEGIN
  DECLARE _freq varchar(50);
  -- Count each status over the last 10 rows, then concatenate, most frequent first
  -- (this assumes the table has a freq column for the trigger to populate)
  SELECT GROUP_CONCAT(t.status ORDER BY t.Occurrences DESC, t.status) INTO _freq
  FROM (SELECT status, COUNT(*) AS Occurrences
        FROM (SELECT status FROM sample ORDER BY id DESC LIMIT 10) AS last10
        GROUP BY status) AS t;
  SET new.freq = _freq;
END
SELECT id, GROUP_CONCAT(status ORDER BY freq desc) FROM
(SELECT a.id as id, b.status, COUNT(*) as freq
FROM
sample a
JOIN
sample b ON (b.id < a.id) AND (b.id > a.id - 11)
WHERE
a.id > 10
GROUP BY a.id, b.status) AS sub
GROUP BY id;
SQL Fiddle