I have a table (job_logs) with the following columns:
id, job_id, user_id, status, created_at, job_type.
Each time a job starts to run, a record is written to the job_logs table with status='started'. When a job finishes running, another record is added to the table with status='completed'.
Both records have the same user_id, job_type and job_id (which is determined by the process running the job and is unique to these 2 records).
I want a query that will return all these record pairs in the table (ordered by id desc), but the tricky part is that I want to add to the record with the 'completed' status the time it took the job to run (completed.created_at - started.created_at).
How can I do that?
SELECT j1.job_id AS job_id, (j2.created_at - j1.created_at) AS time_run
FROM job_logs j1 INNER JOIN job_logs j2 ON (j1.job_id = j2.job_id)
WHERE j1.status = 'started' AND j2.status = 'completed'
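If you want both rows of each pair back, with the duration attached only to the 'completed' row, something along these lines might work. This is a minimal sketch run against an in-memory SQLite database, not your actual setup: the sample rows are made up, and the strftime('%s', ...) date arithmetic is SQLite-specific (in MySQL, for example, TIMESTAMPDIFF(SECOND, ...) would be the usual choice).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_logs (id INTEGER PRIMARY KEY, job_id INTEGER, user_id INTEGER,"
    " status TEXT, created_at TEXT, job_type TEXT)"
)
# Hypothetical sample data: two jobs, each with a 'started' and a 'completed' row.
conn.executemany(
    "INSERT INTO job_logs (job_id, user_id, status, created_at, job_type)"
    " VALUES (?, ?, ?, ?, ?)",
    [
        (100, 1, "started", "2024-01-01 10:00:00", "export"),
        (101, 2, "started", "2024-01-01 10:00:30", "import"),
        (100, 1, "completed", "2024-01-01 10:02:00", "export"),
        (101, 2, "completed", "2024-01-01 10:00:45", "import"),
    ],
)

# Every row is joined to the 'started' row of its own job; the duration is
# only computed when the outer row is the 'completed' one, otherwise NULL.
rows = conn.execute(
    """
    SELECT j.id, j.job_id, j.status,
           CASE WHEN j.status = 'completed'
                THEN CAST(strftime('%s', j.created_at) AS INTEGER)
                   - CAST(strftime('%s', s.created_at) AS INTEGER)
           END AS seconds_run
    FROM job_logs j
    LEFT JOIN job_logs s
           ON s.job_id = j.job_id AND s.status = 'started'
    ORDER BY j.id DESC
    """
).fetchall()

for row in rows:
    print(row)
```

The LEFT JOIN keeps a row even if its 'started' partner is missing, in which case seconds_run simply stays NULL.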
I am trying to get a selection of data out of my database but am having trouble. I'm sure it's something simple that I am not seeing, but I can't figure it out.
A table, jobs, has 5 fields: job_id, job_status, job_schedulestatus, job_schedulestatus2, job_schedulestatus3. The id is an auto-incremented number, the status can have two values, Active or Invoiced, and each schedule status can hold a large number of values that are selected from a drop-down, but in this case I just want to focus on the values FOC, Cancelled and Sample.
What I am trying to do is select all rows that are set to Active and don't have FOC, Cancelled or Sample set in any of schedule statuses 1-3.
Here is my select statement:
SELECT job_id, job_status, job_schedulestatus, job_schedulestatus2, job_schedulestatus3
FROM jobs WHERE job_status='Active'
AND ( job_schedulestatus!='FOC' OR job_schedulestatus!='Cancelled' OR job_schedulestatus!='Sample' )
OR ( job_schedulestatus2!='FOC' OR job_schedulestatus2!='Cancelled' OR job_schedulestatus2!='Sample' )
OR ( job_schedulestatus3!='FOC' OR job_schedulestatus3!='Cancelled' OR job_schedulestatus3!='Sample' )
ORDER BY job_id DESC;
This still shows all rows that have FOC, Cancelled or Sample. Now if I remove the != and replace it with just =, it will only show those with FOC, Cancelled or Sample, which suggests to me that there is an issue with using !=. I tried replacing it with <> but it still doesn't work.
If I just test it with one check on the schedule status it works as below:
SELECT job_id, job_status, job_schedulestatus, job_schedulestatus2, job_schedulestatus3
FROM jobs WHERE job_status='Active' AND job_schedulestatus!='Cancelled'
ORDER BY job_id DESC;
Any Ideas?
Thanks in advance
This seems easier to process:
SELECT job_id
, job_status
, job_schedulestatus
, job_schedulestatus2
, job_schedulestatus3
FROM jobs
WHERE job_status = 'Active'
-- AND, not OR: a row qualifies only if none of the three columns holds an excluded value
AND job_schedulestatus NOT IN ('FOC','Cancelled','Sample')
AND job_schedulestatus2 NOT IN ('FOC','Cancelled','Sample')
AND job_schedulestatus3 NOT IN ('FOC','Cancelled','Sample')
ORDER BY job_id DESC;
Note that generally, where you find yourself with enumerated columns (above, say, 2) you can be confident that your schema design is sub-optimal.
SELECT job_id, job_status, job_schedulestatus, job_schedulestatus2, job_schedulestatus3
FROM jobs WHERE job_status='Active'
AND job_schedulestatus NOT IN ('FOC','Cancelled','Sample')
AND job_schedulestatus2 NOT IN ('FOC','Cancelled','Sample')
AND job_schedulestatus3 NOT IN ('FOC','Cancelled','Sample')
ORDER BY job_id DESC;
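To see why the way the three checks are combined matters, here is a small sketch against an in-memory SQLite table. The rows and the 'Booked' value are invented for illustration. The requirement "none of the three columns holds FOC, Cancelled or Sample" means the three NOT IN checks have to be joined with AND; joining them with OR lets a row through as soon as any one column is clean.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (job_id INTEGER PRIMARY KEY, job_status TEXT,"
    " job_schedulestatus TEXT, job_schedulestatus2 TEXT, job_schedulestatus3 TEXT)"
)
# Hypothetical rows; 'Booked' stands in for any other drop-down value.
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Active", "Booked", "Booked", "Booked"),     # should be returned
        (2, "Active", "FOC", "Booked", "Booked"),        # excluded: FOC
        (3, "Active", "Booked", "Cancelled", "Booked"),  # excluded: Cancelled
        (4, "Invoiced", "Booked", "Booked", "Booked"),   # excluded: not Active
        (5, "Active", "Booked", "Booked", "Sample"),     # excluded: Sample
    ],
)

# All three columns must be free of the excluded values.
and_ids = [row[0] for row in conn.execute(
    """
    SELECT job_id FROM jobs
    WHERE job_status = 'Active'
      AND job_schedulestatus  NOT IN ('FOC', 'Cancelled', 'Sample')
      AND job_schedulestatus2 NOT IN ('FOC', 'Cancelled', 'Sample')
      AND job_schedulestatus3 NOT IN ('FOC', 'Cancelled', 'Sample')
    ORDER BY job_id DESC
    """
)]

# For comparison: one clean column is enough to pass when the checks are ORed.
or_ids = [row[0] for row in conn.execute(
    """
    SELECT job_id FROM jobs
    WHERE job_status = 'Active'
      AND (   job_schedulestatus  NOT IN ('FOC', 'Cancelled', 'Sample')
           OR job_schedulestatus2 NOT IN ('FOC', 'Cancelled', 'Sample')
           OR job_schedulestatus3 NOT IN ('FOC', 'Cancelled', 'Sample'))
    ORDER BY job_id DESC
    """
)]

print(and_ids)
print(or_ids)
```

One thing to watch: if a schedule status column is NULL, NOT IN evaluates to NULL rather than true, so the AND version drops that row unless you handle NULLs explicitly.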
I have two tables, Table A and Table B. Table A contains a few records with the same emp_id and date, as shown below, but the time column has different values. Now I want to insert these two records as one record into Table B. The expected output of Table B is shown below.
Table A
Table B [expected output]
This would be the sort of logic that you need to display that result. Not sure it will work exactly without testing it with your particular tables etc. This will also select the data and display it like that but won't insert it (don't know why you would need that)
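SELECT
MIN(id) AS id,
emp_id,
date,
MAX(CASE WHEN rn = 1 THEN time END) AS in1,
MAX(CASE WHEN rn = 2 THEN time END) AS in2
FROM
(
-- number each employee's rows within a day, earliest time first
SELECT id, emp_id, date, time,
ROW_NUMBER() OVER (PARTITION BY emp_id, date ORDER BY time) AS rn
FROM TableA
) a
GROUP BY emp_id, date;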
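If you do need the rows inserted into Table B rather than just displayed, the same pivot can feed an INSERT ... SELECT. The following is a rough sketch on an in-memory SQLite database (window functions need SQLite 3.25 or newer). Since the real table layouts aren't visible here, the column names of TableA and TableB and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Assumed layouts; adjust to the real tables.
    CREATE TABLE TableA (id INTEGER PRIMARY KEY, emp_id INTEGER, date TEXT, time TEXT);
    CREATE TABLE TableB (emp_id INTEGER, date TEXT, in1 TEXT, in2 TEXT);
    INSERT INTO TableA (emp_id, date, time) VALUES
        (7, '2024-03-01', '08:00'),
        (7, '2024-03-01', '13:30'),
        (9, '2024-03-01', '09:15');
    """
)

# Number each employee's rows per day by time, then spread row 1 and row 2
# into the in1 / in2 columns of a single output row.
conn.execute(
    """
    INSERT INTO TableB (emp_id, date, in1, in2)
    SELECT emp_id, date,
           MAX(CASE WHEN rn = 1 THEN time END),
           MAX(CASE WHEN rn = 2 THEN time END)
    FROM (
        SELECT emp_id, date, time,
               ROW_NUMBER() OVER (PARTITION BY emp_id, date ORDER BY time) AS rn
        FROM TableA
    ) AS numbered
    GROUP BY emp_id, date
    """
)

table_b = conn.execute("SELECT * FROM TableB ORDER BY emp_id").fetchall()
print(table_b)
```

An employee with only one record on a day ends up with NULL in in2.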
I have a table that stores simple log data:
CREATE TABLE chronicle (
id INT auto_increment PRIMARY KEY,
data1 VARCHAR(256),
data2 VARCHAR(256),
time DATETIME
);
The table is approaching 1m records, so I'd like to start consolidating data.
I want to be able to take the first and last record of each DISTINCT(data1, data2) each day and delete all the rest.
I know how to just pull in the data, process it in whatever language I want, and then delete the records with a huge IN (...) query, but it seems like a better alternative would be to use SQL directly (am I wrong?).
I have tried several queries, but I'm not very good with SQL beyond JOINs.
Here is what I have so far:
SELECT id, Max(time), Min(time)
FROM (SELECT id, data1 ,data2, time, Cast(time AS DATE) AS day
FROM chronicle) AS initial
GROUP BY day;
This gets me the first and last time for each day, but it's not separated out by the data (i.e. I get the last record of each day, not the last record for each distinct set of data for each day.) Additionally, the id is just for the Min(time).
The information I've found on this particular problem is only for finding the last record of the day, not each last record for sets of data.
IMPORTANT: I want the first/last record for each DISTINCT(data1, data2) for each day, not just the first/last record for each day in the table. There will be more than 2 records for each day.
Solution:
My solution thanks to Jonathan Dahan and Gordon Linoff:
SELECT o.data1, o.data2, o.time FROM chronicle AS o JOIN (
SELECT Min(id) as id FROM chronicle GROUP BY DATE(time), data1, data2
UNION SELECT Max(id) as id FROM chronicle GROUP BY DATE(time), data1, data2
) AS n ON o.id = n.id;
From here it's a simple matter of referencing the same table to delete rows.
This will improve performance when searching on dates:
ALTER TABLE chronicle
ADD INDEX `ix_chronicle_time` (`time` ASC);
This will delete the records:
CREATE TEMPORARY TABLE tmp_ids (
`id` INT NOT NULL,
PRIMARY KEY (`id`)
);
INSERT INTO tmp_ids (id)
SELECT
MIN(id)
FROM
chronicle
GROUP BY
CAST(time AS DATE),
data1,
data2
UNION
SELECT
MAX(id)
FROM
chronicle
GROUP BY
CAST(time AS DATE),
data1,
data2;
DELETE FROM
chronicle
WHERE
id NOT IN (SELECT id FROM tmp_ids)
AND time <= '2015-01-01'; -- if you want to consider all dates, then remove this condition
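Before running a delete like this on the real table, it may be worth trying it on a few rows first. Here is a small sketch of the same keep-list approach on an in-memory SQLite database, with made-up sample rows. Like the queries above, it assumes ids increase with time, so MIN(id) and MAX(id) stand in for the first and last record of each (day, data1, data2) group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chronicle (id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " data1 TEXT, data2 TEXT, time TEXT)"
)
# Hypothetical log rows; ids are assigned 1..7 in insertion order.
conn.executemany(
    "INSERT INTO chronicle (data1, data2, time) VALUES (?, ?, ?)",
    [
        ("a", "x", "2014-12-30 08:00:00"),  # id 1: first (a, x) on the 30th
        ("a", "x", "2014-12-30 12:00:00"),  # id 2: in between, to be deleted
        ("a", "x", "2014-12-30 18:00:00"),  # id 3: last (a, x) on the 30th
        ("b", "y", "2014-12-30 09:00:00"),  # id 4: only (b, y) row, kept
        ("a", "x", "2014-12-31 07:00:00"),  # id 5: first (a, x) on the 31st
        ("a", "x", "2014-12-31 07:30:00"),  # id 6: in between, to be deleted
        ("a", "x", "2014-12-31 23:00:00"),  # id 7: last (a, x) on the 31st
    ],
)

# Collect the ids to keep, then delete everything else.
conn.executescript(
    """
    CREATE TEMP TABLE tmp_ids (id INTEGER PRIMARY KEY);
    INSERT INTO tmp_ids (id)
        SELECT MIN(id) FROM chronicle GROUP BY DATE(time), data1, data2
        UNION
        SELECT MAX(id) FROM chronicle GROUP BY DATE(time), data1, data2;
    DELETE FROM chronicle WHERE id NOT IN (SELECT id FROM tmp_ids);
    """
)

remaining = [row[0] for row in conn.execute("SELECT id FROM chronicle ORDER BY id")]
print(remaining)
```

The UNION removes the duplicate id when a group has only one row, so the primary key on the keep-list is not violated.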
You have the right idea. You just need to join back to get the original information.
SELECT c.*
FROM chronicle c JOIN
(SELECT data1, data2, date(time) as day, min(time) as mint, max(time) as maxt
FROM chronicle
GROUP BY data1, data2, date(time)
) cc
ON c.data1 = cc.data1 AND c.data2 = cc.data2 AND c.time IN (cc.mint, cc.maxt);
Note that the join condition doesn't need to include day explicitly because it is part of the time. Of course, you could add date(c.time) = cc.day if you wanted to.
Instead of deleting rows in your original table, I would suggest that you make a new table. Something like this:
create table ChronicleByDay like chronicle;
insert into ChronicleByDay
SELECT c.*
FROM chronicle c JOIN
(SELECT data1, data2, date(time) as day, min(time) as mint, max(time) as maxt
FROM chronicle
GROUP BY data1, data2, date(time)
) cc
ON c.data1 = cc.data1 AND c.data2 = cc.data2 AND c.time IN (cc.mint, cc.maxt);
That way, you can have the more detailed information if you ever need it.
This query works and provides me with the information I need, but it is very slow: it takes 18 seconds to aggregate a table of only 4,000 records.
I'm bringing it here to see if anyone has any advice on how to improve it.
SELECT COUNT( status ) AS quantity, status
FROM log_table
WHERE time_stamp
IN (SELECT MAX( time_stamp ) FROM log_table GROUP BY userid )
GROUP BY status
Here's what it does/what it needs to do in plain text:
I have a table full of logs, each log contains a "userid", "status" (integer between 1-12) and "time_stamp" (a time stamp of when the log was created). There may be many entries for a particular userid, but with a different time stamp and status. I'm trying to get the most recent status (based on time_stamp) for each userid, then count the occurrences of each most-recent status among all the users.
My initial idea was to use a subquery with GROUP BY userid. That worked fast, but it always returned the first entry for each userid, not the most recent. If I could do GROUP BY userid using time_stamp DESC to identify which row should be the representative for the group, that would be great. But of course ORDER BY inside of a group does not work.
Any suggestions?
The first thing to try is to make this an explicit join:
SELECT COUNT(lg.status) AS quantity, lg.status
FROM log_table lg join
(select userid, MAX( time_stamp ) as maxts
from log_table
GROUP BY userid
) lgu
on lgu.userid = lg.userid and lgu.maxts = lg.time_stamp
GROUP BY lg.status;
Another approach is to use a different where clause. This will work best if you have an index on log_table(userid, time_stamp). This approach is doing the filtering by saying "there is no timestamp bigger than this one for a given user":
SELECT COUNT(lg.status) AS quantity, lg.status
FROM log_table lg
WHERE not exists (select 1
from log_table lg2
where lg2.userid = lg.userid and lg2.time_stamp > lg.time_stamp
)
GROUP BY lg.status;
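Here is a quick way to try the "no later timestamp for this user" filter on a handful of rows, sketched with an in-memory SQLite database and invented sample data. It also runs the IN (SELECT MAX(...)) form from the question for comparison, because that form matches on the timestamp alone: an older row of one user is counted too if its time_stamp happens to equal another user's latest one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log_table (userid INTEGER, status INTEGER, time_stamp TEXT)")
conn.execute("CREATE INDEX ix_log_user_ts ON log_table (userid, time_stamp)")
# Hypothetical logs. User 3's older row shares a timestamp with user 2's latest row.
conn.executemany(
    "INSERT INTO log_table VALUES (?, ?, ?)",
    [
        (1, 3, "2024-01-01 09:00:00"),
        (1, 5, "2024-01-02 09:00:00"),  # latest for user 1
        (2, 5, "2024-01-01 10:00:00"),  # latest for user 2
        (3, 2, "2024-01-01 10:00:00"),  # older row for user 3
        (3, 7, "2024-01-03 11:00:00"),  # latest for user 3
    ],
)

# Keep a row only if the same user has no row with a later timestamp.
latest_counts = conn.execute(
    """
    SELECT lg.status, COUNT(*) AS quantity
    FROM log_table lg
    WHERE NOT EXISTS (SELECT 1
                      FROM log_table lg2
                      WHERE lg2.userid = lg.userid
                        AND lg2.time_stamp > lg.time_stamp)
    GROUP BY lg.status
    ORDER BY lg.status
    """
).fetchall()

# The form from the question, which compares timestamps across all users.
in_counts = conn.execute(
    """
    SELECT status, COUNT(status) AS quantity
    FROM log_table
    WHERE time_stamp IN (SELECT MAX(time_stamp) FROM log_table GROUP BY userid)
    GROUP BY status
    ORDER BY status
    """
).fetchall()

print(latest_counts)
print(in_counts)
```

In this sample the IN form reports an extra status 2 that belongs to user 3's older row, so the per-user filter is worth having for correctness as well as speed.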
I have the following table:
I'm trying to find a way to get the records for those customers that have expired, and then update the table accordingly (by update I mean add a new record with the entry 'SERVICE EXPIRED' and the customer_id of the relevant customer).
If you look at the bottom of the table, you will notice two records with the entry 'SERVICE EXPIRED' for already existing customers (customer_id 11 and 16).
I'm looking for a SQL Query that will:
Get the last set of distinct records by customer_id
Exclude records for the same customer_id from the resulting resultset that have the entry 'SERVICE EXPIRED' or status_id of 2 appearing later on in the table
If I use the following:
SELECT MAX(id) FROM mytable WHERE status_id != '2' AND expiry < '2012-12-26 19:00:00' GROUP BY customer_id
It will return ids 1, 11, 13, and 16. However, I don't want ids 11 and 16 because the expiry status has already been noted later on in the table (see the last two records of the table), and id 1 has been renewed as can be seen with an updated expiry date in id 3 later. All I want is id 13 because that is the only expired record that does not have a 'SERVICE EXPIRED' entry that appears later in the table.
I'm looking for a SQL Query that will enable me capture this requirement.
Thanks in advance
After some fiddling around I managed to come up with a solution:
SELECT MAX(id)
FROM mytable
WHERE status_id != '2'
AND expiry < '2012-12-26 19:00:00'
AND customer_id NOT IN (SELECT MAX(customer_id) FROM mytable WHERE status_id = '2' GROUP BY customer_id)
GROUP BY customer_id
Thanks @JupiterP5 for pointing me in the right direction.
Regards,
Your requirement is equivalent to finding "n" records after the last expiry on a record. The following query returns all records after the last expiry for a given customer:
select t.*
from t join
(select customer_id, MAX(id) as maxid
from t
where status_id = 2
group by customer_id
) texp
on t.customer_id = texp.customer_id and
t.id > texp.maxid
By using variables cleverly, you can enumerate these to get the last "n". However, do you really need a fixed number? Why not all of them? Why not just one of them?
It's not efficient, but this should work.
SELECT MAX(id)
FROM mytable
WHERE status_id != '2'
AND expiry < '2012-12-26 19:00:00'
AND id NOT IN (SELECT id FROM mytable where status_id = 2)
GROUP BY customer_id
Edit: Missed the service renewed case. I'll update if I think of something.
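One way the renewed case might be covered is to look only at each customer's most recent row and keep it if it is still unmarked and already past its expiry. This is a different approach from the NOT IN queries above, and it rests on an assumption: that the row with the highest id reflects the customer's current state. The sketch below uses an in-memory SQLite database; the column layout and all rows are invented to mirror the description in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, customer_id INTEGER,"
    " entry TEXT, status_id INTEGER, expiry TEXT)"
)
# Hypothetical rows: customer 5 was renewed, 11 and 16 are already marked expired.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        (1, 5, "SERVICE ACTIVATED", 1, "2012-12-01 00:00:00"),
        (3, 5, "SERVICE RENEWED", 1, "2013-06-01 00:00:00"),
        (11, 11, "SERVICE ACTIVATED", 1, "2012-11-01 00:00:00"),
        (13, 13, "SERVICE ACTIVATED", 1, "2012-12-10 00:00:00"),
        (16, 16, "SERVICE ACTIVATED", 1, "2012-12-20 00:00:00"),
        (17, 11, "SERVICE EXPIRED", 2, None),
        (18, 16, "SERVICE EXPIRED", 2, None),
    ],
)

# Take the latest row per customer, then keep it only if it is not an
# expiry marker (status_id 2) and its expiry lies before the cutoff.
expired_ids = [row[0] for row in conn.execute(
    """
    SELECT m.id
    FROM mytable m
    JOIN (SELECT customer_id, MAX(id) AS max_id
          FROM mytable
          GROUP BY customer_id) latest
      ON latest.max_id = m.id
    WHERE m.status_id != 2
      AND m.expiry < '2012-12-26 19:00:00'
    ORDER BY m.id
    """
)]
print(expired_ids)
```

With these sample rows only id 13 comes back: customer 5's latest row has a future expiry, and customers 11 and 16 end on a 'SERVICE EXPIRED' row.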