Query to find entries and transpose - sql-server-2008

I've got a machine log available in an SQL table. I can do a bit in SQL, but I'm not good enough to process the following:
In the data column there are entries containing "RUNPGM: Recipe name" and "RUNBRKPGM: Recipe name"
What I want is a view containing 4 columns:
TimeStamp RUNPGM
TimeStamp RUNBRKPGM
Recipe Name
Time Difference in seconds
There is a bit of a catch:
Sometimes the machine logs an empty RUNBRKPGM that should be ignored
The RUNBRKPGM is sometimes logged with an error message. This entry should also be ignored.
It's always the RUNBRKPGM entry with just the recipe name that's the actual end of the recipe.

NOTE: I understand this is not a full/complete answer, but with info available in question as of now, I believe it at least helps give a starting point since this is too complicated (and formatted) to put in the comments:
If Recipe is everything in the DATA field except the 'RUNPGM = ' part, you can do something similar to this:
SELECT
-- TimeStamp for records with RUNPGM (NULL for all other rows)
CASE WHEN DATA LIKE 'RUNPGM%' THEN TS END AS RUNPGM_TimeStamp,
-- TimeStamp for records with RUNBRKPGM (NULL for all other rows)
CASE WHEN DATA LIKE 'RUNBRKPGM%' THEN TS END AS RUNBRKPGM_TimeStamp,
-- everything after the RUNPGM = (which I think is the recipe you are referring to)
CASE WHEN DATA LIKE 'RUNPGM%' THEN REPLACE(DATA, 'RUNPGM = ', '') END AS RUNPGM_Recipe,
-- everything after the RUNBRKPGM = (which I think is the recipe you are referring to)
CASE WHEN DATA LIKE 'RUNBRKPGM%' THEN REPLACE(DATA, 'RUNBRKPGM = ', '') END AS RUNBRKPGM_Recipe
FROM TableName
I'm not sure which columns you want the Time Difference calculated on, though, so I don't have that column in here.
If you need additional logic or formatting on the columns once they are separated, you can wrap the query above in a subquery.

As a first swing, I'd try the following:
Create a view that uses string splitting to break the DATA column into its parts (e.g. RunType and RecipeName)
Create a simple select that outputs the recipe name and tstamp where the runtype is RUNPGM.
Then add an OUTER APPLY:
Essentially, joining onto itself.
SELECT
t1.RecipeName,
t1.TimeStamp AS Start,
t2.TimeStamp AS Stop,
-- run time in seconds; NULL when no matching stop row exists
DATEDIFF(second, t1.TimeStamp, t2.TimeStamp) AS RunTime
FROM newView t1
OUTER APPLY ( SELECT TOP ( 1 ) *
FROM newView x
WHERE x.RecipeName = t1.RecipeName
AND x.RunType = 'RUNBRKPGM'
-- only stops logged after this start (assumes ID increases with time)
AND x.ID > t1.ID
ORDER BY x.ID ASC ) t2
WHERE t1.RunType = 'RUNPGM';
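To try the pairing idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. Everything in it is an assumption for illustration: the table machine_log and its columns id, ts and data are hypothetical names, the 'RUNPGM: ' / 'RUNBRKPGM: ' prefixes follow the wording of the question, and the error-message format is invented. Because the stop row is matched on exact equality with 'RUNBRKPGM: ' plus the recipe name, the empty entries and the entries carrying an error message drop out on their own. SQLite has no OUTER APPLY or DATEDIFF, so a correlated subquery and strftime('%s', ...) stand in for them here.

```python
import sqlite3

# Hypothetical layout: machine_log(id, ts, data); the real names are not given in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE machine_log (id INTEGER PRIMARY KEY, ts TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO machine_log (id, ts, data) VALUES (?, ?, ?)",
    [
        (1, "2020-01-01 08:00:00", "RUNPGM: RecipeA"),
        (2, "2020-01-01 08:04:00", "RUNBRKPGM: "),                # empty entry, ignored
        (3, "2020-01-01 08:05:00", "RUNBRKPGM: RecipeA ERR 12"),  # invented error format, ignored
        (4, "2020-01-01 08:10:00", "RUNBRKPGM: RecipeA"),         # the real end of the run
        (5, "2020-01-01 09:00:00", "RUNPGM: RecipeA"),
        (6, "2020-01-01 09:00:30", "RUNBRKPGM: RecipeA"),
    ],
)

PAIR_RUNS = """
SELECT s.ts AS runpgm_ts,
       e.ts AS runbrkpgm_ts,
       substr(s.data, 9) AS recipe,   -- text after 'RUNPGM: ' (8 characters)
       CAST(strftime('%s', e.ts) AS INTEGER)
         - CAST(strftime('%s', s.ts) AS INTEGER) AS seconds
FROM machine_log AS s
LEFT JOIN machine_log AS e
  ON e.id = (SELECT MIN(x.id)         -- first clean stop row logged after this start
             FROM machine_log AS x
             WHERE x.id > s.id
               AND x.data = 'RUNBRKPGM: ' || substr(s.data, 9))
WHERE s.data LIKE 'RUNPGM: %'
ORDER BY s.id
"""

runs = conn.execute(PAIR_RUNS).fetchall()
for run in runs:
    print(run)
```

The LEFT JOIN keeps a start row even when no clean stop has been logged yet; in that case the stop timestamp and the difference come back as NULL.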

Related

MySQL - Separating data within 1 column into 3 separate columns for a report

We have a report to track how many edits our sales reps are doing. The current query to pull the number of edits on all 3 pages is below. We didn't care before which page they were making edits on, but now we want to see which pages they are making those edits on.
We want 3 different columns (bhns, hns, chns) to show up on the report and need to modify this query to show them. So, split the 1 column (customer_edits) into 3 columns based on page.
SELECT
count( `database2`.`sales_edits`.`id` ) AS `customer_edits`,
`database2`.`sales_edits`.`rep` AS `rep`
FROM
`database2`.`sales_edits`
WHERE
((
cast( `database2`.`sales_edits`.`date` AS date ) = curdate())
AND ((
`database2`.`sales_edits`.`page` = 'chs'
)
OR ( `database2`.`sales_edits`.`page` = 'chns' )
OR ( `database2`.`sales_edits`.`page` = 'bhns' )))
GROUP BY
`database2`.`sales_edits`.`rep`
sales_edit table:
It looks like you want conditional aggregation:
select
rep,
sum(page = 'chs') customer_edits_chs,
sum(page = 'chns') customer_edits_chns,
sum(page = 'bhns') customer_edits_bhns
from database2.sales_edits
where date = current_date and page in ('chs', 'chns', 'bhns')
group by rep
Your original code looks more complicated than it needs to be:
there is no need to prefix all columns with the schema and table name, since only a single table comes into play (and if you had more than one, you should use table aliases to shorten the code)
the date casting seems unnecessary if the column is a DATE; MySQL happily understands any string in 'yyyy-mm-dd' format as a date (if the column is a DATETIME, keep the cast so the time part is ignored)
the repeated or conditions can be shortened with in
it is probably unnecessary to surround all identifiers with backticks, as long as they do not contain special characters
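For anyone who wants to see the conditional aggregation run, here is a small sketch using Python's built-in sqlite3 module. The sample rows and rep names are made up, and the date is passed as a parameter rather than taken from current_date so the result is reproducible. SQLite, like MySQL, evaluates a comparison such as page = 'chs' to 0 or 1, so the same sum(...) trick works there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_edits (id INTEGER PRIMARY KEY, rep TEXT, page TEXT, date TEXT)"
)
conn.executemany(
    "INSERT INTO sales_edits (rep, page, date) VALUES (?, ?, ?)",
    [
        ("ann", "chs", "2024-01-02"),
        ("ann", "chs", "2024-01-02"),
        ("ann", "bhns", "2024-01-02"),
        ("bob", "chns", "2024-01-02"),
        ("bob", "other", "2024-01-02"),  # filtered out by the IN list
        ("ann", "chs", "2024-01-01"),    # filtered out by the date
    ],
)

EDITS_PER_PAGE = """
SELECT rep,
       SUM(page = 'chs')  AS customer_edits_chs,
       SUM(page = 'chns') AS customer_edits_chns,
       SUM(page = 'bhns') AS customer_edits_bhns
FROM sales_edits
WHERE date = ? AND page IN ('chs', 'chns', 'bhns')
GROUP BY rep
ORDER BY rep
"""

edits = conn.execute(EDITS_PER_PAGE, ("2024-01-02",)).fetchall()
for row in edits:
    print(row)
```

On databases that do not treat a comparison as a number, the more portable spelling of the same idea is SUM(CASE WHEN page = 'chs' THEN 1 ELSE 0 END).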

MySQL: how can i push data onto existing column data?

i have a db field that i want to use to track intervals. i want to push completed intervals onto the db field when they are completed. to wit:
intervals = '10'
intervals = '1020' <-- pushing 20 onto the field
intervals = '102040' <-- pushing 40 onto the field
intervals = '102040SP' <-- pushing SP onto the field
the values will never decrement (and order doesn't really matter, if that's a factor), so i'm only looking for a way to UPDATE the field, but i have no idea how to do that because UPDATE tbl SET ... just overwrites the existing contents. i looked into CONCAT, but that works with variables the user provides, not with existing data AND additional user data. if i were to write some PSEUDO code, it might look like this:
UPDATE tbl PUSHTO intervals VALUE newInterval WHERE id='id' AND date='date'
so. can anybody help me out here? there has to be a way to do this. :)
An update with concatenation is what you want here:
UPDATE tbl
SET intervals = CONCAT(intervals, newInterval)
WHERE id = 'id' AND date = 'date';
CONCAT returns NULL if any of its arguments is NULL, so if newInterval (or the existing value) might be NULL, wrap the arguments in COALESCE:
UPDATE tbl
SET intervals = CONCAT(COALESCE(intervals, ''), COALESCE(newInterval, ''))
WHERE id = 'id' AND date = 'date';
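Here is a minimal runnable sketch of the same append-in-place update, using Python's built-in sqlite3 module. The helper push_interval and the sample id and date values are made up for illustration. SQLite's || operator stands in for MySQL's CONCAT; it also yields NULL when either side is NULL, which is why both sides are wrapped in COALESCE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id TEXT, date TEXT, intervals TEXT)")
# The row starts out with no intervals recorded at all.
conn.execute("INSERT INTO tbl VALUES ('7', '2024-01-02', NULL)")


def push_interval(conn, row_id, date, new_interval):
    """Append new_interval to the existing value instead of overwriting it."""
    conn.execute(
        "UPDATE tbl "
        "SET intervals = COALESCE(intervals, '') || COALESCE(?, '') "
        "WHERE id = ? AND date = ?",
        (new_interval, row_id, date),
    )


# None simulates a missing newInterval; it leaves the stored value unchanged.
for value in ["10", "20", None, "40", "SP"]:
    push_interval(conn, "7", "2024-01-02", value)

intervals = conn.execute("SELECT intervals FROM tbl WHERE id = '7'").fetchone()[0]
print(intervals)
```

Passing the new value as a bound parameter, rather than splicing it into the SQL string, also avoids quoting problems when the value comes from user input.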

Sql: Find sum of column from second table using date from first table

I've been struggling to build a query that calculates the sum of a column called 'TIDAL_VOLUME' with respect to a date value that comes from another table.
Please see the content of the Table_1:
Please see the content of the Table_2:
Note: TIDAL_VOLUME might have NULL as well.
Now, the start time for the O2_Device value 'Endotracheal tube' is '2013-08-06 08:10:05' for the same HADM_ID and SUBJECT_ID, and the end time is whenever a new O2_Device value comes in, which in this case is 'Nasal cannula'. That means the start time for 'Endotracheal tube' is '2013-08-06 08:10:05' and the end time is '2013-08-06 10:15:05' for HADM_ID = 1 and SUBJECT_ID = 100.
Using that start time and end time criteria, I have to look for TIDAL_VOLUME in Table_2. In this example the values are 700 and 800, so the answer for TIDAL_VOLUME is 1500.
The resulting output should look like this:
Thanks in advance.
If you can add End_Time to the first table, you can use BETWEEN when you join the tables.
SELECT t1.HADM_ID, t1.Subject_ID, t1.ChartTime, SUM(t2.tidal_volume) AS tidal_volume
FROM Table_1 AS t1
JOIN Table_2 AS t2
ON t1.HADM_ID = t2.HADM_ID
AND t1.Subject_ID = t2.Subject_ID
AND t2.ChartTime BETWEEN t1.ChartTime AND t1.End_Time
GROUP BY t1.HADM_ID, t1.Subject_ID, t1.ChartTime
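If adding an End_Time column is not an option, the end of each interval can be derived in the query itself. The sketch below, using Python's built-in sqlite3 module, is one way to do that under some assumptions: every row in Table_1 marks a device change, the column names follow the ones used in the question and the query above, and the Table_2 timestamps are invented since the original table contents are not available. It uses a half-open interval (>= start, < end) instead of BETWEEN, because BETWEEN includes both ends and a reading taken exactly at a device change would be counted in two intervals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table_1 (HADM_ID INT, SUBJECT_ID INT, ChartTime TEXT, O2_Device TEXT)")
conn.execute("CREATE TABLE Table_2 (HADM_ID INT, SUBJECT_ID INT, ChartTime TEXT, TIDAL_VOLUME INT)")
conn.executemany("INSERT INTO Table_1 VALUES (?, ?, ?, ?)", [
    (1, 100, "2013-08-06 08:10:05", "Endotracheal tube"),
    (1, 100, "2013-08-06 10:15:05", "Nasal cannula"),
])
conn.executemany("INSERT INTO Table_2 VALUES (?, ?, ?, ?)", [
    (1, 100, "2013-08-06 08:30:00", 700),   # invented timestamps
    (1, 100, "2013-08-06 09:30:00", 800),
    (1, 100, "2013-08-06 09:45:00", None),  # NULL readings are skipped by SUM
    (1, 100, "2013-08-06 11:00:00", 450),
])

SUM_PER_DEVICE = """
WITH spans AS (
    SELECT t1.HADM_ID, t1.SUBJECT_ID, t1.O2_Device,
           t1.ChartTime AS Start_Time,
           (SELECT MIN(n.ChartTime)          -- next device change, if any
            FROM Table_1 AS n
            WHERE n.HADM_ID = t1.HADM_ID
              AND n.SUBJECT_ID = t1.SUBJECT_ID
              AND n.ChartTime > t1.ChartTime) AS End_Time
    FROM Table_1 AS t1
)
SELECT s.O2_Device, s.Start_Time, s.End_Time,
       SUM(t2.TIDAL_VOLUME) AS TIDAL_VOLUME
FROM spans AS s
LEFT JOIN Table_2 AS t2
  ON t2.HADM_ID = s.HADM_ID
 AND t2.SUBJECT_ID = s.SUBJECT_ID
 AND t2.ChartTime >= s.Start_Time
 AND (s.End_Time IS NULL OR t2.ChartTime < s.End_Time)
GROUP BY s.HADM_ID, s.SUBJECT_ID, s.O2_Device, s.Start_Time, s.End_Time
ORDER BY s.Start_Time
"""

totals = conn.execute(SUM_PER_DEVICE).fetchall()
for row in totals:
    print(row)
```

The last device in a stay has no following row, so its End_Time is NULL and the interval is treated as open-ended.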

Need help in writing Efficient SQL query

I have the following query, written inside a Perl script:
insert into #temp_table
select distinct bv.port,bv.sip,avg(bv.bv) bv, isnull(avg(bv.book_sum),0) book_sum,
avg(bv.book_tot) book_tot,
check_null = case when bv.book_sum = null then 0 else 1 end
from table_bv bv, table_group pge, table_master sm
where pge.a_p_g = '$val'
and pge.p_c = bv.port
and bv.r = '$r'
and bv.effective_date = '$date'
and sm.sip = bv.sip
query continued -- need help below (can some one help me make this efficient, or rewriting, I am thinking its wrong)
and ((sm.s_g = 'FE')OR(sm.s_g='CH')OR(sm.s_g='FX')
OR(sm.s_g='SH')OR(sm.s_g='FD')OR(sm.s_g='EY')
OR ((sm.s_t = 'TA' OR sm.s_t='ON')))
query continued below
group by bv.port,bv.sip
query ends
explanation: some $val that contain sip with
s_g ('FE','CH','FX','SH','FD','EY') and
s_t ('TA','ON') have book_sum as null. The temp_table does not take null values,
hence I am inserting them as zero ( isnull(avg(bv.book_sum),0) ) where ever it encounters a null for the following s_g and s_m ONLY.
I have tried rewriting the condition as follows, but it made my script stop working:
and sm.s_g in ('FE', 'CH','FX','SH','FD','EY')
or sm.s_t in ('TA','ON')
I know this should be a comment, but I don't have the rep. To me, it looks like it's hanging because you lost your grouping at the end. I think it should be:
and (
sm.s_g in ('FE', 'CH','FX','SH','FD','EY')
or
sm.s_t in ('TA','ON')
)
Note the parentheses. AND binds more tightly than OR, so without them you're asking for all of the earlier conditions, OR that sm.s_t is one of TA or ON, which is a much larger set than you're anticipating and may cause it to spin.
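The effect of the missing parentheses is easy to reproduce on a toy table. This is only an illustration using Python's built-in sqlite3 module: the table rows_demo and its contents are made up, and the condition r = 'X' stands in for the whole chain of earlier AND conditions in the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rows_demo (r TEXT, s_g TEXT, s_t TEXT)")
conn.executemany("INSERT INTO rows_demo VALUES (?, ?, ?)", [
    ("X", "FE", "ZZ"),  # wanted: right r, s_g in the list
    ("X", "ZZ", "TA"),  # wanted: right r, s_t in the list
    ("Y", "ZZ", "TA"),  # wrong r, but sneaks in through the bare OR
    ("Y", "FE", "ZZ"),  # wrong r, never matches
    ("X", "ZZ", "ZZ"),  # right r, but neither list matches
])

# Parsed as: (r = 'X' AND s_g IN (...)) OR s_t IN (...)
WITHOUT_PARENS = """
SELECT COUNT(*) FROM rows_demo
WHERE r = 'X' AND s_g IN ('FE', 'CH') OR s_t IN ('TA', 'ON')
"""

# Parsed as: r = 'X' AND (s_g IN (...) OR s_t IN (...))
WITH_PARENS = """
SELECT COUNT(*) FROM rows_demo
WHERE r = 'X' AND (s_g IN ('FE', 'CH') OR s_t IN ('TA', 'ON'))
"""

count_without = conn.execute(WITHOUT_PARENS).fetchone()[0]
count_with = conn.execute(WITH_PARENS).fetchone()[0]
print(count_without, count_with)
```

In the original three-table query the damage is worse than one extra row: the OR branch also escapes the join conditions, so every sm row with s_t of TA or ON is paired with every row of the other two tables.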

How can I sanitize my DB from these duplicates

I have a table with the following fields:
id | domainname | domain_certificate_no | keyvalue
An example for the output of a select statement can be as:
'57092', '02a1fae.netsolstores.com', '02a1fae.netsolstores.com_1', '55525772666'
'57093', '02a1fae.netsolstores.com', '02a1fae.netsolstores.com_2', '22225554186'
'57094', '02a1fae.netsolstores.com', '02a1fae.netsolstores.com_3', '22444356259'
'97168', '02aa6aa.netsolstores.com', '02aa6aa.netsolstores.com_1', '55525772666'
'97169', '02aa6aa.netsolstores.com', '02aa6aa.netsolstores.com_2', '22225554186'
'97170', '02aa6aa.netsolstores.com', '02aa6aa.netsolstores.com_3', '22444356259'
I need to sanitize my db such that: I want to remove the domain names that have repeated keyvalue for the first domain_certificate_no (i.e, in this example, I look for the field domain_certificate_no: 02aa6aa.netsolstores.com_1, since it is number 1, and has repeated value for the key, then I want to remove the whole chain which is 02aa6aa.netsolstores.com_2 and 02aa6aa.netsolstores.com_3 and this by deleting the domain name that this chain belongs to which is 02aa6aa.netsolstores.com.
How can I automate the checking process for the whole DB? I want a query that checks any domain name matching the pattern ('%.%.%') EDIT: AND sharing a domain name (in this example: netsolstores.com). If it finds that cert no. 1 belonging to this domain name has a repeated key value, then delete; otherwise don't. Please note that it is OK for domain_certificate_no to have a repeated value if it is not number 1.
EDIT: I only compare the repeated values for the same second-level domain name. Ex: in this question, I compare the values that share the domain name .netsolstores.com. If I have another domain name with subdomains, I do the same. But the point is that I don't need to compare the whole DB, only the values with a shared domain name (but a different subdomain).
I'm not sure what happens with '02aa6aa.netsolstores.com_1' in your example.
The following keeps only the minimum id for any repeated key:
with t as (
select t.*,
substr(domain_certificate_no,
instr(domain_certificate_no, '_') + 1, 1000) as version,
left(domain_certificate_no, instr(domain_certificate_no, '_') - 1) as dcn
from t
)
select t.*
from t join
(select keyvalue, min(dcn) as mindcn
from t
group by keyvalue
) tsum
on t.keyvalue = tsum.keyvalue and
t.dcn = tsum.mindcn
For the data you provide, this seems to do the trick. This will not return the "_1" version of the repeats. If that is important, the query can be pretty easily modified.
Although I prefer to be more positive (thinking about the rows to keep rather than delete), the following should delete what you want:
with t as (
select t.*,
substr(domain_certificate_no,
instr(domain_certificate_no, '_') + 1, 1000) as version,
left(domain_certificate_no, instr(domain_certificate_no, '_') - 1) as dcn
from t
),
tokeep as (
select t.*
from t join
(select keyvalue, min(dcn) as mindcn
from t
group by keyvalue
) tsum
on t.keyvalue = tsum.keyvalue and
t.dcn = tsum.mindcn
)
delete from t
where t.id not in (select id from tokeep)
There are other ways to express this that are possibly more efficient (depending on the database). This, though, keeps the structure of the original query.
By the way, when trying new DELETE code, be sure that you stash a copy of the table. It is easy to make a mistake with DELETE (and UPDATE). For instance, if you leave out the WHERE clause, all the rows will disappear, after the long painful process of logging all of them. You might find it faster to simply select the desired results into a new table, validate them, then truncate the old table and re-insert them.
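As a rough illustration of that workflow (build the set of rows to keep, check it, and only then delete), here is a sketch using Python's built-in sqlite3 module and the sample rows from the question. The table names certs and certs_keep are hypothetical, and the keep rule simply mirrors the queries above (the lowest dcn per keyvalue), which may not be exactly the rule the question intends. SQLite has no LEFT() function, so substr() takes its place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE certs (id INTEGER PRIMARY KEY, domainname TEXT, "
    "domain_certificate_no TEXT, keyvalue TEXT)"
)
conn.executemany("INSERT INTO certs VALUES (?, ?, ?, ?)", [
    (57092, "02a1fae.netsolstores.com", "02a1fae.netsolstores.com_1", "55525772666"),
    (57093, "02a1fae.netsolstores.com", "02a1fae.netsolstores.com_2", "22225554186"),
    (57094, "02a1fae.netsolstores.com", "02a1fae.netsolstores.com_3", "22444356259"),
    (97168, "02aa6aa.netsolstores.com", "02aa6aa.netsolstores.com_1", "55525772666"),
    (97169, "02aa6aa.netsolstores.com", "02aa6aa.netsolstores.com_2", "22225554186"),
    (97170, "02aa6aa.netsolstores.com", "02aa6aa.netsolstores.com_3", "22444356259"),
])

# Step 1: materialise the rows to keep in a separate table; nothing is deleted yet.
conn.execute("""
CREATE TABLE certs_keep AS
SELECT c.*
FROM certs AS c
JOIN (SELECT keyvalue,
             MIN(substr(domain_certificate_no, 1,
                        instr(domain_certificate_no, '_') - 1)) AS mindcn
      FROM certs
      GROUP BY keyvalue) AS tsum
  ON c.keyvalue = tsum.keyvalue
 AND substr(c.domain_certificate_no, 1,
            instr(c.domain_certificate_no, '_') - 1) = tsum.mindcn
""")

# Step 2: validate the keep set before touching the original table.
kept = conn.execute("SELECT COUNT(*) FROM certs_keep").fetchone()[0]
total = conn.execute("SELECT COUNT(*) FROM certs").fetchone()[0]
assert 0 < kept <= total, "keep set looks wrong, refusing to delete"

# Step 3: delete inside a transaction; an exception would roll it back.
with conn:
    conn.execute("DELETE FROM certs WHERE id NOT IN (SELECT id FROM certs_keep)")

remaining = [row[0] for row in conn.execute("SELECT id FROM certs ORDER BY id")]
print(remaining)
```

The sanity check in step 2 is deliberately crude; on real data it is worth eyeballing a sample of certs_keep (and of the rows about to be deleted) before running step 3.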