SUM of differences between selective rows in table - mysql

I have a table with call records. Each call has a 'state' CALLSTART and CALLEND, and each call has a unique 'callid'. Also for each record there is a unique autoincrement 'id'. Each row has a MySQL TIMESTAMP field.
In a previous question I asked for a way to calculate the total number of seconds of phone calls. That led to this SQL:
SELECT SUM(TIME_TO_SEC(differences))
FROM
(
SELECT SEC_TO_TIME(TIMESTAMPDIFF(SECOND,MIN(timestamp),MAX(timestamp)))as differences
FROM table
GROUP BY callid
)x
Now I would like to know how to do this only for callids that also have a row with the state CONNECTED.
Screenshot of table: http://imgur.com/gmdeSaY

Use a having clause:
SELECT SUM(difference)
FROM (SELECT callid, TIMESTAMPDIFF(SECOND, MIN(timestamp), MAX(timestamp)) as difference
FROM table
GROUP BY callid
HAVING SUM(state = 'Connected') > 0
) c;
Since you only want the difference in seconds, I simplified the calculation a bit: there is no need to convert to a time value with SEC_TO_TIME and back with TIME_TO_SEC.
EDIT: (for Mihai)
If you put in:
HAVING state in ('Connected')
Then the value of state comes from an arbitrary row for each callid. Not all the rows, just an arbitrary one. You might or might not get lucky. As a general rule, avoid using the MySQL extension that allows "bare" columns in the select and having clauses, unless you really use the feature intentionally and carefully.
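As a rough illustration of the HAVING approach, here is a minimal Python sketch that uses the standard-library sqlite3 module as a stand-in for MySQL. The table name calls, the integer ts column (epoch seconds) and the sample rows are assumptions made for the example; SQLite has no TIMESTAMPDIFF, so the difference is computed as MAX(ts) - MIN(ts). Note that SQLite compares strings case-sensitively by default, so the state literal must match the stored value exactly.

```python
import sqlite3

def connected_call_seconds(rows):
    """Sum call durations, counting only callids that have a CONNECTED row.

    rows: iterable of (callid, state, ts), with ts in epoch seconds (assumed layout).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE calls (id INTEGER PRIMARY KEY, callid TEXT, state TEXT, ts INTEGER)"
    )
    conn.executemany("INSERT INTO calls (callid, state, ts) VALUES (?, ?, ?)", rows)
    total = conn.execute(
        """
        SELECT SUM(difference)
        FROM (SELECT callid, MAX(ts) - MIN(ts) AS difference
              FROM calls
              GROUP BY callid
              -- the comparison yields 0 or 1 per row, so the sum counts CONNECTED rows
              HAVING SUM(state = 'CONNECTED') > 0) c
        """
    ).fetchone()[0]
    conn.close()
    return total or 0  # SUM over zero groups is NULL

sample = [
    ("a", "CALLSTART", 100), ("a", "CONNECTED", 105), ("a", "CALLEND", 160),
    ("b", "CALLSTART", 200), ("b", "CALLEND", 230),  # never connected, ignored
    ("c", "CALLSTART", 300), ("c", "CONNECTED", 310), ("c", "CALLEND", 340),
]
print(connected_call_seconds(sample))  # 60 + 40 = 100
```

Call b contributes nothing because its group has no CONNECTED row, which is exactly what the HAVING clause filters on.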

Related

IF function condition Subqueries all executed or conditioned only?

I have a query like this, run against a table with over 1000 topics:
SELECT
IF ( (SELECT COUNT(*) FROM topics) > 1000,
(SELECT MAX(id) FROM topics),
(SELECT MIN(id) FROM topics)
) AS MMID
My understanding is that COUNT(*) runs first and MAX(id) runs after it,
but I do not know whether MIN(id) is calculated too, at a cost to performance.
Does the same apply to OR conditions in WHERE?
It's mostly irrelevant. The COUNT(*) needs to scan through the table to get the count. But MIN and MAX are each trivial -- find the first or last entry in the index. (I am assuming you have PRIMARY KEY(id).)
If you are likely to have a table that is much bigger than 1000 rows, this should run faster: Change
(SELECT COUNT(*) FROM topics) > 1000
to
( EXISTS ( SELECT 1 FROM topics LIMIT 1000,1 ) )
That should quit after scanning 1001 rows (it skips 1000 and needs one more to exist), returning essentially true/false.
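The EXISTS-with-offset idea can be sketched as follows. This is a minimal Python example using the standard-library sqlite3 module rather than MySQL; the helper name has_more_than and the generated rows are assumptions for illustration. SQLite's LIMIT 1 OFFSET n corresponds to MySQL's LIMIT n,1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE topics (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO topics (title) VALUES (?)",
    ((f"topic {i}",) for i in range(1001)),  # 1001 sample rows
)

def has_more_than(conn, n):
    """Return True if topics holds more than n rows, without counting them all."""
    # Skip n rows and ask for one more; EXISTS only needs to know if that row is there.
    row = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM topics LIMIT 1 OFFSET ?)", (n,)
    ).fetchone()
    return bool(row[0])

print(has_more_than(conn, 1000))  # True: there are 1001 rows
print(has_more_than(conn, 1001))  # False
```

Whether this is actually faster than COUNT(*) on a given MySQL table is something to measure; the sketch only shows that the two formulations answer the same yes/no question.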
OR is a different matter.
SELECT ...
WHERE ...
OR ...
often cannot make good use of an index. Unless the optimizer can apply an index merge (which needs a usable index for each side of the OR), it must scan the entire table checking both expressions. It will short-circuit the evaluation, but it is unclear which side of the OR will be checked first. I would hope (without any evidence) that it would decide that one side of the OR is clearly 'faster' and do it first (in hopes of getting TRUE).
WHERE ... AND ... does have specific known short circuits: If one side is MATCH..., that will be performed first.

Two(with subquery) or one query to select max(date) in where clause. MySQL

I need to create a table and store there cached status of some events. So I will have to do only two operations:
1) Insert the id of an event, its status, and the time when this record was stored in the db;
2) Get last record with certain event id.
There are several methods to get the result (status):
Method 1:
SELECT status FROM status_log a
WHERE a.event_id = 1
ORDER BY a.update_date DESC
LIMIT 1
Method 2:
SELECT status FROM status_log a
WHERE a.update_date = (
SELECT max(b.update_date) FROM status_log b
WHERE b.event_id = 1
) AND a.event_id = 1
So I have two questions:
Which query to use
Which field type to set to update_date field (int or timestamp)
Actually, your second query does not answer 'find the record with the greatest update date for event #1', because there could be many different events with the same latest update_date. So, in terms of semantics, you should use the first query. (After your edit this is fixed.)
The first query will be efficient if you create an index on event_id and this column has good cardinality (i.e. the WHERE clause filters down to few enough rows using that index). This can be improved by adding the update_date column to the index, but that makes sense only if there will be many rows with the same event_id (enough for MySQL to use the second index part), again with good cardinality within the first index part.
In practice, though, my advice is just theory; you'll have to verify it with EXPLAIN and your own measurements on real data.
As for the data type, common practice is to use the proper type (i.e. datetime/timestamp for something that represents a point in time).
Which query to use
I believe the first one should be faster. Anyway just run an EXPLAIN on them and you'll find out yourself.
The index you should be using will be:
ALTER TABLE status_log ADD INDEX(event_id, update_date)
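To make the comparison concrete, here is a small Python sketch using the standard-library sqlite3 module as a stand-in for MySQL. The index name idx_event_date, the text-typed update_date column and the sample rows are assumptions for the example. It creates the composite index and runs both methods for one event.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE status_log (event_id INTEGER, status TEXT, update_date TEXT)")
# Composite index: filter on event_id, then walk update_date in order.
conn.execute("CREATE INDEX idx_event_date ON status_log (event_id, update_date)")
conn.executemany(
    "INSERT INTO status_log VALUES (?, ?, ?)",
    [
        (1, "pending", "2024-01-01 10:00:00"),
        (1, "done", "2024-01-02 09:30:00"),
        (2, "failed", "2024-01-03 08:00:00"),
    ],
)

def latest_status(conn, event_id):
    """Method 1: newest row for the event via ORDER BY ... LIMIT 1."""
    row = conn.execute(
        "SELECT status FROM status_log a WHERE a.event_id = ? "
        "ORDER BY a.update_date DESC LIMIT 1",
        (event_id,),
    ).fetchone()
    return row[0] if row else None

def latest_status_subquery(conn, event_id):
    """Method 2: match the row whose date equals the event's MAX(update_date)."""
    row = conn.execute(
        "SELECT status FROM status_log a WHERE a.update_date = ("
        "  SELECT MAX(b.update_date) FROM status_log b WHERE b.event_id = ?"
        ") AND a.event_id = ?",
        (event_id, event_id),
    ).fetchone()
    return row[0] if row else None

print(latest_status(conn, 1), latest_status_subquery(conn, 1))  # done done
```

With the event_id filter present in the outer query, both methods agree on this data; they can still differ if two rows for the same event share the same update_date, since Method 2 would then match both.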
Now... did you notice that those queries are NOT equivalent? The second one will return all status from all event_id that have a maximum date.
Which field type to set to update_date field (int or timestamp)
If you have a field named update_date I just can't imagine why an int would serve the same purpose. Rephrasing the question to choose between datetime or timestamp, then the answer is up to the requirements. If you just want to know when a record in the DB was updated use a timestamp. If the update_date refers to an entity in your domain model go for a datetime. You will most likely need to perform calculations on the date (add time, remove time, extract a month, etc) so using a unix timestamp (which I'd say should be almost write-only) will result in extra calculation time because you'll have to convert the timestamp to a datetime and then perform the function over that result.

get last record in file

I have a table (rather badly designed, but anyway) which consists only of strings. The worst part is that there is a script which adds records from time to time. Records will never be deleted.
I believe that MySQL stores records in a random access file, and that I could get the last or any other record using C or something similar, since I know the max length of a record and I can find EOF.
When I do something like "SELECT * FROM table" in MySQL I get all the records in the right order, because MySQL reads this file from the beginning to the end. I need only the last one(s).
Is there a way to get the LAST record (or records) using MySQL query only, without ORDER BY?
Well, I suppose I've found a solution here, so my current query is
SELECT
@i:=@i+1 AS iterator,
t.*
FROM
table t,
(SELECT @i:=0) i
ORDER BY
iterator DESC
LIMIT 5
If there's a better solution, please let me know!
The order is not guaranteed unless you use an ORDER BY. It just happens that the records you're getting back are sorted the way you need them.
This is where keys (a primary key, for example) matter.
You can modify your table by adding an auto_increment primary key column.
Then you can query
select * from your_table where id =(select max(id) from your_table);
and get the last inserted row.
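A minimal sketch of this suggestion, written in Python with the standard-library sqlite3 module instead of MySQL (SQLite's INTEGER PRIMARY KEY AUTOINCREMENT plays the role of an auto_increment column here). The column name line, the helper names and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (id INTEGER PRIMARY KEY AUTOINCREMENT, line TEXT)"
)
conn.executemany(
    "INSERT INTO your_table (line) VALUES (?)",
    [("first",), ("second",), ("third",)],
)

def last_row(conn):
    """The single most recently inserted row, found via MAX(id)."""
    return conn.execute(
        "SELECT id, line FROM your_table "
        "WHERE id = (SELECT MAX(id) FROM your_table)"
    ).fetchone()

def last_rows(conn, n):
    """The n most recent rows; ordering by the key makes the order explicit."""
    return conn.execute(
        "SELECT id, line FROM your_table ORDER BY id DESC LIMIT ?", (n,)
    ).fetchall()

print(last_row(conn))      # (3, 'third')
print(last_rows(conn, 2))  # [(3, 'third'), (2, 'second')]
```

Fetching several trailing rows does use ORDER BY, but on an indexed key the database can typically read the index backwards rather than sort the whole table.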

only select the row if the field value is unique

I sort the rows on date. If I want to select every row that has a unique value in the last column, can I do this with sql?
So I would like to select the first row, second one, third one not, fourth one I do want to select, and so on.
What you want are not unique rows, but rather one per group. This can be done by taking the MIN(pk_artikel_Id) and GROUP BY fk_artikel_bron. This method uses an IN subquery to get the first pk_artikel_id and its associated fk_artikel_bron for each unique fk_artikel_bron and then uses that to get the remaining columns in the outer query.
SELECT * FROM tbl
WHERE pk_artikel_id IN
(SELECT MIN(pk_artikel_id) AS id FROM tbl GROUP BY fk_artikel_bron)
Although MySQL would permit you to add the rest of the columns in the SELECT list initially, avoiding the IN subquery, that isn't really portable to other RDBMS systems. This method is a little more generic.
It can also be done with a JOIN against the subquery, which may or may not be faster. Hard to say without benchmarking it.
SELECT *
FROM tbl
JOIN (
SELECT
fk_artikel_bron,
MIN(pk_artikel_id) AS id
FROM tbl
GROUP BY fk_artikel_bron) mins ON tbl.pk_artikel_id = mins.id
This is similar to Michael's answer, but does it with a self-join instead of a subquery. Try it out to see how it performs:
SELECT * from tbl t1
LEFT JOIN tbl t2
ON t2.fk_artikel_bron = t1.fk_artikel_bron
AND t2.pk_artikel_id < t1.pk_artikel_id
WHERE t2.pk_artikel_id IS NULL
If you have the right indexes, this type of join often outperforms subqueries (since derived tables don't use indexes).
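The three portable variants above (IN subquery, JOIN against a derived table, and LEFT JOIN anti-join) can be checked against each other with a small Python sketch. It uses the standard-library sqlite3 module in place of MySQL; the titel column and the sample rows are assumptions, chosen so that the first, second and fourth rows are the ones expected back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (pk_artikel_id INTEGER PRIMARY KEY, titel TEXT, fk_artikel_bron INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [(1, "a", 10), (2, "b", 20), (3, "c", 10), (4, "d", 30), (5, "e", 20)],
)

QUERIES = {
    "in_subquery": """
        SELECT pk_artikel_id FROM tbl
        WHERE pk_artikel_id IN
          (SELECT MIN(pk_artikel_id) FROM tbl GROUP BY fk_artikel_bron)
        ORDER BY pk_artikel_id""",
    "join": """
        SELECT tbl.pk_artikel_id FROM tbl
        JOIN (SELECT fk_artikel_bron, MIN(pk_artikel_id) AS id
              FROM tbl GROUP BY fk_artikel_bron) mins
          ON tbl.pk_artikel_id = mins.id
        ORDER BY tbl.pk_artikel_id""",
    "anti_join": """
        SELECT t1.pk_artikel_id FROM tbl t1
        LEFT JOIN tbl t2
          ON t2.fk_artikel_bron = t1.fk_artikel_bron
         AND t2.pk_artikel_id < t1.pk_artikel_id
        WHERE t2.pk_artikel_id IS NULL
        ORDER BY t1.pk_artikel_id""",
}

# Run each variant and collect the selected primary keys.
results = {name: [r[0] for r in conn.execute(sql)] for name, sql in QUERIES.items()}
print(results)  # every variant selects rows 1, 2 and 4
```

The sketch only confirms that the variants return the same rows; which one is fastest on a real MySQL table depends on indexes and data and needs benchmarking.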
This non-standard, MySQL-only trick will select the first row encountered for each value of fk_artikel_bron.
select *
...
group by fk_artikel_bron
Like it or not, this query produces the output asked for.
Edited
I seem to be getting hammered here, so here's the disclaimer:
This only works for mysql 5+
Although the MySQL documentation says the row returned using this technique is not predictable (i.e. you could get any row as the "first" encountered), in all cases I've ever seen you get the first row in the order selected. So, to get a predictable row in practice (this may not work in future releases, but probably will), select from an ordered result:
select * from (
select *
...
order by pk_artikel_id) x
group by fk_artikel_bron

What does it mean by select 1 from table?

I have seen many queries with something as follows.
Select 1
From table
What does this 1 mean, how will it be executed and, what will it return?
Also, in what type of scenarios, can this be used?
select 1 from table will return the constant 1 for every row of the table. It's useful when you want to cheaply determine if record matches your where clause and/or join.
SELECT 1 FROM TABLE_NAME means "return 1 from the table". It is pretty unremarkable on its own, so normally it is used with WHERE and often EXISTS. (As @gbn notes, this is not necessarily best practice. It is, however, common enough to be noted, even if it isn't really meaningful. That said, I will use it because others use it and it is "more obvious" immediately. Of course, that might be a vicious chicken vs. egg cycle, but I don't generally dwell.)
SELECT * FROM TABLE1 T1 WHERE EXISTS (
SELECT 1 FROM TABLE2 T2 WHERE T1.ID= T2.ID
);
Basically, the above will return everything from table 1 which has a corresponding ID from table 2. (This is a contrived example, obviously, but I believe it conveys the idea. Personally, I would probably do the above as SELECT * FROM TABLE1 T1 WHERE ID IN (SELECT ID FROM TABLE2); as I view that as FAR more explicit to the reader unless there were a circumstantially compelling reason not to).
EDIT
There actually is one case which I forgot about until just now. When you are trying to determine the existence of a value in the database from an outside language, sometimes SELECT 1 FROM TABLE_NAME will be used. This does not offer significant benefit over selecting an individual column, but, depending on implementation, it may offer substantial gains over doing a SELECT *, simply because it is often the case that the more columns the DB returns to a language, the larger the data structure, which in turn means that more time will be taken.
If you mean something like
SELECT * FROM AnotherTable
WHERE EXISTS (SELECT 1 FROM table WHERE...)
then it's a myth that the 1 is better than
SELECT * FROM AnotherTable
WHERE EXISTS (SELECT * FROM table WHERE...)
The 1 or * in the EXISTS is ignored and you can write this as per Page 191 of the ANSI SQL 1992 Standard:
SELECT * FROM AnotherTable
WHERE EXISTS (SELECT 1/0 FROM table WHERE...)
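To illustrate that the select list inside EXISTS does not change the result, here is a small Python sketch using the standard-library sqlite3 module rather than MySQL. The tables table1 and table2 and their ids are assumptions for the example. It compares EXISTS (SELECT 1 ...), EXISTS (SELECT * ...) and the IN form mentioned earlier. The 1/0 variant is left out because SQLite evaluates 1/0 to NULL instead of raising an error, so it would not demonstrate anything here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY);
    INSERT INTO table1 (id) VALUES (1), (2), (3);
    INSERT INTO table2 (id) VALUES (2), (3), (4);
    """
)

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

with_one = ids(
    "SELECT t1.id FROM table1 t1 "
    "WHERE EXISTS (SELECT 1 FROM table2 t2 WHERE t1.id = t2.id) ORDER BY t1.id"
)
with_star = ids(
    "SELECT t1.id FROM table1 t1 "
    "WHERE EXISTS (SELECT * FROM table2 t2 WHERE t1.id = t2.id) ORDER BY t1.id"
)
with_in = ids("SELECT id FROM table1 WHERE id IN (SELECT id FROM table2) ORDER BY id")

print(with_one, with_star, with_in)  # [2, 3] [2, 3] [2, 3]
```

All three return the ids present in both tables; this says nothing about relative speed, only that the choice of 1 versus * inside EXISTS does not affect which rows qualify.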
It does what it says: it will always return the integer 1. It's used to check whether a record matching your where clause exists.
select 1 from table is used by some databases as a query to test a connection to see if it's alive, often used when retrieving or returning a connection to / from a connection pool.
The result is 1 for every record in the table.
To be slightly more specific, you would use this to do
SELECT 1 FROM MyUserTable WHERE user_id = 33487
instead of doing
SELECT * FROM MyUserTable WHERE user_id = 33487
because you don't care about looking at the results. Asking for the number 1 is very easy for the database (since it doesn't have to do any look-ups).
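From an application, such an existence check might look like the following Python sketch. It uses the standard-library sqlite3 module in place of MySQL; the name column, the sample row and the helper user_exists are assumptions for illustration, and the LIMIT 1 is an addition so that at most one row is returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyUserTable (user_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO MyUserTable (user_id, name) VALUES (33487, 'example user')")

def user_exists(conn, user_id):
    """True if a row with this user_id exists; no column data is fetched."""
    row = conn.execute(
        "SELECT 1 FROM MyUserTable WHERE user_id = ? LIMIT 1", (user_id,)
    ).fetchone()
    # fetchone() returns None when the WHERE clause matched nothing.
    return row is not None

print(user_exists(conn, 33487))  # True
print(user_exists(conn, 1))      # False
```

The caller only looks at whether a row came back, so the constant 1 is all that needs to travel from the database to the application.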
Although it is not widely known, a query can have a HAVING clause without a GROUP BY clause.
In such circumstances, the HAVING clause is applied to the entire set. Clearly, the SELECT clause cannot refer to any column, otherwise you would (correctly) get the error, "Column is invalid in select because it is not contained in the GROUP BY" etc.
Therefore, a literal value must be used (because SQL doesn't allow a resultset with zero columns -- why?!) and the literal value 1 (INTEGER) is commonly used: if the HAVING clause evaluates TRUE then the resultset will be one row with one column showing the value 1, otherwise you get the empty set.
Example: to find whether a column has more than one distinct value:
SELECT 1
FROM tableA
HAVING MIN(colA) < MAX(colA);
If you don't know whether your table contains any data, you can use the following query:
SELECT cons_value FROM table_name;
For example:
SELECT 1 FROM employee;
It returns a single column with as many rows as the table has, and every row holds the same constant value (here, 1).
If there are no rows in your table, it returns nothing.
So we use this query to find out whether there is any data in the table; the number of rows returned tells you how many rows the table has.
If you just want to check a true or false based on the WHERE clause, select 1 from table where condition is the cheapest way.
This means that you want the value 1 as output. Most of the time it is used in inner queries, because you want to evaluate the outer query based on the result of the inner one. You don't always use 1; you may select some other specific value.
This statically gives you the value 1 as output.
I often see it used in SQL injection, such as:
www.urlxxxxx.com/xxxx.asp?id=99 union select 1,2,3,4,5,6,7,8,9 from database;
These numbers can be used to guess where the database exists, and to guess the column names and table values of the database you specified.
Note that this does not retrieve the first column of the table. Given
select Emply_num,Empl_no From Employees ;
writing
select 1 from Employees;
does not return the Emply_num column; it returns the literal value 1 for each row. (A bare number refers to a column position only in clauses such as ORDER BY.)
There is another reason, at least for MySQL. This is from the MySQL manual:
InnoDB computes index cardinality values for a table the first time that table is accessed after startup, instead of storing such values in the table. This step can take significant time on systems that partition the data into many tables. Since this overhead only applies to the initial table open operation, to “warm up” a table for later use, access it immediately after startup by issuing a statement such as SELECT 1 FROM tbl_name LIMIT 1
This is just used for convenience with IF EXISTS(). Otherwise you can go with
select * from [table_name]
In the case of IF EXISTS, we just need to know whether any row matching the specified condition exists; the content of the row doesn't matter.
select 1 from Users
The example above returns as many rows as there are users, each with 1 in a single column.