MySQL: adding the differences between time values to find the average difference

I have a column that is TIME formatted; it needs to be sorted newest to oldest. What I would like to do is find the difference in time between each adjoining record. The tricky part is that I need to sum all of the time differences and then divide by the count minus 1 of all the time records. Can this be done in MySQL?

I'm sorry if I am being a bit too wordy, but I can't quite glean your level of MySQL experience. Also apologies if I don't understand your question. But here goes...
First of all, you don't need to sum and divide; MySQL has an average function for you, called AVG(). See here for details:
http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html
What you want can be done with subqueries, I think. For more info on subqueries, look here:
http://dev.mysql.com/doc/refman/5.0/en/select.html
Basically, you first want a query that sorts the column:
SELECT someid, time
FROM table
ORDER BY time
Use that in a subquery that joins the table with itself, but with a shifted index, to get the time before and time after (this assumes someid values are consecutive, with no gaps):
SELECT *
FROM table AS t1
INNER JOIN table AS t2 ON t1.someid = t2.someid + 1
And use AVG() on that:
-- TIMEDIFF/TIME_TO_SEC avoid the pitfalls of subtracting TIME values directly;
-- ABS() makes the gap positive regardless of sort direction, and there is
-- no GROUP BY because we want one overall average
SELECT AVG(ABS(TIME_TO_SEC(TIMEDIFF(t1.time, t2.time)))) AS avg_diff_seconds
FROM table AS t1
INNER JOIN table AS t2 ON t1.someid = t2.someid + 1
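Incidentally, the sum of the differences between consecutive sorted times telescopes: everything cancels except the newest and oldest values. So if a single overall average is all you need, the whole calculation collapses to one aggregate query; a minimal sketch, assuming your table is named t and the column is named time:
SELECT TIME_TO_SEC(TIMEDIFF(MAX(time), MIN(time))) / (COUNT(*) - 1) AS avg_diff_seconds
FROM t;
No sorting or self-join is needed here, since MAX() and MIN() don't care about row order.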

Related

Query takes too long to run

I am running the below query to retrieve the unique latest result based on a date field within the same table. But this query takes too much time as the table grows. Any suggestion to improve this is welcome.
select t2.*
from
(
    select
    (
        select id
        from ctc_pre_assets ti
        where ti.ctcassettag = t1.ctcassettag
        order by ti.createddate desc
        limit 1
    ) lid
    from
    (
        select distinct ctcassettag
        from ctc_pre_assets
    ) t1
) ro,
ctc_pre_assets t2
where t2.id = ro.lid
order by id
Our table may contain the same row multiple times, but each row with a different timestamp. My objective is, based on a single column (for example assettag), to retrieve a single row for each assettag with the latest timestamp.
It's simpler, and probably faster, to find the newest date for each ctcassettag and then join back to find the whole row that matches.
This does assume that no ctcassettag has multiple rows with the same createddate, in which case you can get back more than one row per ctcassettag.
SELECT ctc_pre_assets.*
FROM ctc_pre_assets
INNER JOIN
(
    SELECT ctcassettag, MAX(createddate) AS createddate
    FROM ctc_pre_assets
    GROUP BY ctcassettag
) newest
    ON  newest.ctcassettag = ctc_pre_assets.ctcassettag
    AND newest.createddate = ctc_pre_assets.createddate
ORDER BY ctc_pre_assets.id
EDIT: To deal with multiple rows with the same date.
You haven't actually said how to pick which row you want in the event that multiple rows are for the same ctcassettag on the same createddate. So, this solution just chooses the row with the lowest id from amongst those duplicates.
SELECT ctc_pre_assets.*
FROM ctc_pre_assets
WHERE ctc_pre_assets.id =
(
    SELECT lookup.id
    FROM ctc_pre_assets lookup
    WHERE lookup.ctcassettag = ctc_pre_assets.ctcassettag
    ORDER BY lookup.createddate DESC, lookup.id ASC
    LIMIT 1
)
This does still use a correlated sub-query, which is slower than a simple nested-sub-query (such as my first answer), but it does deal with the "duplicates".
You can change the rules on which row to pick by changing the ORDER BY in the correlated sub-query.
It's also very similar to your own query, but with one less join.
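For what it's worth, the correlated lookup is usually tolerable if it is index-assisted; a sketch of a supporting index (the index name is mine, the columns come from the query above):
CREATE INDEX idx_tag_date_id ON ctc_pre_assets (ctcassettag, createddate, id);
With this, the equality on ctcassettag is an index seek, and the ORDER BY ... LIMIT 1 can be answered by reading the index backwards instead of sorting the matching rows on every outer row.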
Nested queries are generally known to take longer than a conventional query. Can you prepend EXPLAIN to the query and put your results here? That will help us analyse the exact query/table which is taking longer to respond.
Check if the table has indexes. Unindexed tables are not advisable (unless obviously required to be unindexed) and are alarmingly slow in executing queries.
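For example, a sketch of EXPLAIN applied to the inner lookup from the original query (the tag value 'TAG-001' is just a stand-in):
EXPLAIN
SELECT ti.id
FROM ctc_pre_assets ti
WHERE ti.ctcassettag = 'TAG-001'
ORDER BY ti.createddate DESC
LIMIT 1;
The key and rows columns of the output show which index (if any) is used and how many rows MySQL expects to examine.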
On the contrary, I think the best option is to avoid writing nested queries altogether. Better, run each of the queries separately and then use the results (in array or list format) in the second query.
First some questions that you should at least ask yourself, but maybe also give us an answer to improve the accuracy of our responses:
Is your data normalized? If yes, maybe you should make an exception to avoid this brutal subquery problem
Are you using indexes? If yes, which ones, and are you using them to the fullest?
Some suggestions to improve the readability and maybe performance of the query:
- Use joins
- Use group by
- Use aggregators
Example (untested, so might not work, but should give an impression):
SELECT t2.*
FROM (
    SELECT id AS lid  -- alias as lid so the outer join condition matches
    FROM ctc_pre_assets
    GROUP BY ctcassettag
    HAVING createddate = MAX(createddate)
    ORDER BY ctcassettag DESC
) ro
INNER JOIN ctc_pre_assets t2 ON t2.id = ro.lid
ORDER BY id
Using normalization is great, but there are a few caveats where normalization causes more harm than good. This seems like one of those situations, but without your tables in front of me, I can't tell for sure.
Using distinct the way you are doing, I can't help but get the feeling you might not get all relevant results - maybe someone else can confirm or deny this?
It's not that subqueries are all bad, but they tend to create massive scalability issues if written incorrectly. Make sure you use them the right way (google it?)
Indexes can potentially save you a bunch of time - if you actually use them. It's not enough to set them up; you have to write queries that actually use your indexes. Google this as well.
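To make that last point concrete, a sketch using the table from this question (the index name is mine): an index only helps when the query filters on a leftmost prefix of it.
CREATE INDEX idx_tag_date ON ctc_pre_assets (ctcassettag, createddate);

-- can use the index: ctcassettag is the leftmost column
SELECT * FROM ctc_pre_assets WHERE ctcassettag = 'TAG-001';

-- cannot seek on the index: createddate alone is not a leftmost prefix
SELECT * FROM ctc_pre_assets WHERE createddate > '2016-01-01';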

Write a query that will compare its results to the same results but with another date as reference

I have a query that compares the final balance of a month with the final balance of the same month but from the year before.
The query works just fine; the issue is when I want to check against more than 2 years before. My predecessor wrote a query for that, but it takes too much time to print the results: it just adds another query per year we want to see, so the more years, the larger the query.
Another predecessor created a pivot table to present his information, showing only up to 3 years before. The query itself is good, but when we want to display the whole information, all the joins and unions make the query inefficient time-wise.
The project has recently been passed on to me. The original (structure/backbone) query looks good for comparing each month's final balance with last year's monthly final balance, but I wish to make a more dynamic report regardless of the year/month we're looking into, not one that is entirely hard-coded or repeats the same query over and over. I've hit a wall, since I can't come up with any idea of how to make it work in a more dynamic way; I'm fairly new to reporting and data analysis, and that's basically what's limiting my progress.
SELECT T2.[Segment_0]+'-'+T2.[Segment_1]+'-'+T2.[Segment_2] Cuenta,
T2.[AcctName], SUM(T0.[Debit]) Debito, SUM(T0.[Credit]) Credito,
SUM(T0.[Debit])-SUM(T0.[Credit]) Saldo
FROM [server].[DB1].[dbo].[JDT1] T0
INNER JOIN [server].[DB1].[dbo].[OJDT] T1
ON T1.[TransId] = T0.[TransId]
INNER JOIN [server].[DB1].[dbo].[oact] T2
ON T2.[AcctCode] = T0.[Account]
WHERE T0.[RefDate] >= '2007-12-31' AND T0.[RefDate] <= '2016-06-30'
GROUP BY T2.[Segment_0]+'-'+T2.[Segment_1]+'-'+T2.[Segment_2],T2.[AcctName]
I'm not looking for someone to do this for me, but for someone who can point me and guide through the best possible course of action to achieve this.
Here are some suggestions:
It isn't clear to me why you need [server].[DB1].[dbo].[OJDT] T1. Its data doesn't appear in the output and it isn't needed to join T0 to T2. If you can omit it, do so.
If you can't omit it because you need to exclude transactions from T0 that aren't in T1, use an EXISTS clause rather than joining it in (a sketch follows the outline below).
Use a CTE to group the T0 records by Account, and then join the CTE to T2. That way T2 doesn't have to join to every record in T0, just the summarized result. You also don't need to group by your composite field and your account name: if you do your grouping in the CTE, there is nothing left for the outer query to group.
Here's a sort of outline of what that would look like:
;
WITH Summed as (
SELECT Account
, SUM(Credito) as SumCredito
...
FROM [JDT1] T0
WHERE T0.[RefDate] >= ...
GROUP BY Account
)
SELECT (.. your composite segment field ..)
, AccountName
, SumCredito
FROM Summed T1
JOIN [oact] T2
ON T1.account = T2.acctcode
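As promised, a sketch of the EXISTS variant of the CTE's inner query (table and column names taken from your original query, dates kept from your WHERE clause):
SELECT T0.[Account], SUM(T0.[Credit]) AS SumCredito
FROM [JDT1] T0
WHERE T0.[RefDate] >= '2007-12-31'
  AND T0.[RefDate] <= '2016-06-30'
  AND EXISTS (SELECT 1 FROM [OJDT] T1 WHERE T1.[TransId] = T0.[TransId])
GROUP BY T0.[Account];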
If you want dynamic dates, you will probably need to parameterize this and turn it into a stored proc if it isn't one already.
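A minimal sketch of that parameterization (the procedure and parameter names are mine, not from your system; the body reuses your original query):
CREATE PROCEDURE dbo.GetMonthlyBalances
    @StartDate date,
    @EndDate   date
AS
BEGIN
    SELECT T2.[Segment_0]+'-'+T2.[Segment_1]+'-'+T2.[Segment_2] AS Cuenta,
           T2.[AcctName],
           SUM(T0.[Debit]) - SUM(T0.[Credit]) AS Saldo
    FROM [JDT1] T0
    INNER JOIN [oact] T2 ON T2.[AcctCode] = T0.[Account]
    WHERE T0.[RefDate] >= @StartDate
      AND T0.[RefDate] <= @EndDate
    GROUP BY T2.[Segment_0]+'-'+T2.[Segment_1]+'-'+T2.[Segment_2], T2.[AcctName];
END
Your report can then call it once per period (EXEC dbo.GetMonthlyBalances '2015-01-01', '2015-12-31') instead of pasting a new copy of the query for each year.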
Push as much formatting (which includes pivoting already-grouped data from a list into a matrix) into the reporting tool as possible. Achieving dynamic pivoting is tricky in T-SQL but trivial in SSRS, to pick just one tool.
Remember, you can always dynamically set the column headers in your tool: you don't have to change the column names in your data.
Hope this helps.

How can I make my mysql getting one record per month query faster?

I have a big database with about 3 million records, each containing a timestamp.
Now I want to select one record per month and it works using this query:
SELECT timestamp, id, gas_used, kwh_used1, kwh_used2 FROM energy
GROUP BY MONTH(timestamp) ORDER BY timestamp ASC
It works but it is very slow.
I have indexes on id and on timestamp.
What can I do to make this query fast?
GROUP BY MONTH(timestamp) is forcing the engine to look at each record individually, aka a sequential scan, which obviously is very slow when you have 3 million records.
A common solution is to add an indexed column with just the criterion you will want to select on. However, I highly suspect that you will actually want to select on Year-Month, if your db is not reset every year.
To avoid data corruption issues, it may be best to create an insert trigger that automatically fills that field. That way this extra column doesn't interfere with your business logic.
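A sketch of that column-plus-trigger approach (the yearmonth column and the trigger/index names are mine; the energy table and timestamp column come from the question):
ALTER TABLE energy ADD COLUMN yearmonth CHAR(7);
CREATE INDEX idx_energy_yearmonth ON energy (yearmonth);

CREATE TRIGGER energy_set_yearmonth
BEFORE INSERT ON energy
FOR EACH ROW
SET NEW.yearmonth = DATE_FORMAT(NEW.timestamp, '%Y-%m');
Existing rows need a one-off backfill (UPDATE energy SET yearmonth = DATE_FORMAT(timestamp, '%Y-%m');), after which GROUP BY yearmonth can work from the index.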
It is not good practice to SELECT columns that don't appear in the GROUP BY clause unless they are wrapped in an aggregate function such as MIN(), MAX(), SUM(), etc.
In your query this applies to columns:
id, gas_used, kwh_used1, kwh_used2
You will not get the "earliest" (by timestamp) row for each month this way; MySQL is free to return the values from any row in the group.
More:
https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html
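If the goal is, say, the earliest row of each month, a deterministic rewrite (a sketch, assuming no two rows share a timestamp) is to find each month's MIN(timestamp) and join back:
SELECT e.timestamp, e.id, e.gas_used, e.kwh_used1, e.kwh_used2
FROM energy e
INNER JOIN (
    SELECT MIN(timestamp) AS first_ts
    FROM energy
    GROUP BY YEAR(timestamp), MONTH(timestamp)
) m ON e.timestamp = m.first_ts
ORDER BY e.timestamp ASC;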

What is the fastest way to count the total number of rows in MySQL

I need to know the fastest way to count the rows of views for each product. I tried to run a query joining two tables, 'product_db' and 'product_views', but it took about a minute to complete.
Here is my code:
select *,
count(product_views.vwr_id) as product_viewer
from product_db
inner join product_views on product_db.id=product_views.vwr_cid
where product_id='$pid' order by id desc
Where '$pid' is a product id.
This is my product_views table.
I need to include a column of viewer counts in my table, but it takes a very long time to load. I also tried counting in a separate query, but no luck. Please suggest a more brilliant way.
Regards,
It sounds like your query is slow, not the counting. Two things you could try:
Make sure the product_id field has an index on it.
If the product_id is a numeric field, remove the single quotes around it. In other words change this where product_id='$pid' to this where product_id=$pid. MySQL could be doing a conversion on the product_id field to convert it to a string for the comparison and ignoring the index if it does exist.
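Putting both suggestions together, a sketch (the index name is mine; 123 stands in for $pid, and I'm assuming vwr_cid holds the product id, as the join in your query suggests):
CREATE INDEX idx_views_cid ON product_views (vwr_cid);

SELECT COUNT(*) AS product_viewer
FROM product_views
WHERE vwr_cid = 123;
Counting in product_views alone, with an index on vwr_cid, lets MySQL answer the count from the index instead of joining every matching row back to product_db.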

only select the row if the field value is unique

I sort the rows on date. If I want to select every row that has a unique value in the last column, can I do this with sql?
So I would like to select the first row, second one, third one not, fourth one I do want to select, and so on.
What you want are not unique rows, but rather one per group. This can be done by taking the MIN(pk_artikel_Id) and GROUP BY fk_artikel_bron. This method uses an IN subquery to get the first pk_artikel_id and its associated fk_artikel_bron for each unique fk_artikel_bron and then uses that to get the remaining columns in the outer query.
SELECT * FROM tbl
WHERE pk_artikel_id IN
(SELECT MIN(pk_artikel_id) AS id FROM tbl GROUP BY fk_artikel_bron)
Although MySQL would permit you to add the rest of the columns in the SELECT list initially, avoiding the IN subquery, that isn't really portable to other RDBMS systems. This method is a little more generic.
It can also be done with a JOIN against the subquery, which may or may not be faster. Hard to say without benchmarking it.
SELECT *
FROM tbl
JOIN (
    SELECT fk_artikel_bron, MIN(pk_artikel_id) AS id
    FROM tbl
    GROUP BY fk_artikel_bron
) mins ON tbl.pk_artikel_id = mins.id
This is similar to Michael's answer, but does it with a self-join instead of a subquery. Try it out to see how it performs:
SELECT * from tbl t1
LEFT JOIN tbl t2
ON t2.fk_artikel_bron = t1.fk_artikel_bron
AND t2.pk_artikel_id < t1.pk_artikel_id
WHERE t2.pk_artikel_id IS NULL
If you have the right indexes, this type of join often outperforms subqueries (since derived tables don't use indexes).
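For this anti-join, "the right indexes" means a composite one covering both columns used in the ON clause; a sketch, with my naming and the table/column names from the snippets above:
CREATE INDEX idx_bron_id ON tbl (fk_artikel_bron, pk_artikel_id);
The equality on fk_artikel_bron becomes an index seek, and the pk_artikel_id comparison is resolved within the same index entries.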
This non-standard, MySQL-only trick will select the first row encountered for each value of fk_artikel_bron.
select *
...
group by fk_artikel_bron
Like it or not, this query produces the output asked for.
Edited
I seem to be getting hammered here, so here's the disclaimer:
This only works in MySQL 5+.
Although the MySQL documentation says the row returned using this technique is not predictable (i.e. you could get any row as the "first" encountered), in all cases I've ever seen you get the first row in the order selected. So, to get a row that is predictable in practice (but may not work in future releases, though it probably will), select from an ordered result:
select * from (
    select *
    ...
    order by pk_artikel_id
) x
group by fk_artikel_bron
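One more caveat worth adding: MySQL 5.7 enables the ONLY_FULL_GROUP_BY SQL mode by default, which rejects this query outright. A sketch of checking and, if you accept the risk, relaxing it for the current session:
SELECT @@sql_mode;  -- see whether ONLY_FULL_GROUP_BY is present

SET SESSION sql_mode = (SELECT REPLACE(@@sql_mode, 'ONLY_FULL_GROUP_BY', ''));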