Selecting latest conversations from table containing private messages - mysql

I have a table containing personal messages from one user to another.
Here is the table structure:
mysql> describe pms;
+---------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| time | datetime | NO | | NULL | |
| from | int(11) | NO | | NULL | |
| from_ip | int(11) | NO | | NULL | |
| to | int(11) | NO | | NULL | |
| message | varchar(255) | NO | | NULL | |
| read | tinyint(4) | NO | | 0 | |
+---------+--------------+------+-----+---------+----------------+
I am creating a view that shows the 10 latest conversations for a particular user id. Since I want to find conversations, I thought of using GROUP BY from, to. This, however, returned duplicate rows (both from this user and to this user), and I also noticed that the ordering does not work as it should.
To order the results properly and select the 10 latest conversations, each group should contain its latest row instead of its first.
Here is the query I tried:
SELECT *
FROM `pms`
WHERE `from` = 1
OR `to` = 1
GROUP BY `from` , `to`
ORDER BY `id` DESC
LIMIT 10
This gives the wrong row from each group, so ordering by id (or time) gives the wrong order.
Any ideas how I could get it working?

This assumes that a conversation is defined by the from-to pair in either order, and that the latest conversation has the largest id:
SELECT least(`from`, `to`), greatest(`from`, `to`)
FROM `pms`
WHERE `from` = 1 OR `to` = 1
GROUP BY least(`from`, `to`), greatest(`from`, `to`)
ORDER BY max( `id`) DESC
LIMIT 10
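The query above returns only the pair of user ids for each conversation. If the latest message of each conversation is also needed, one option is to take max(id) per pair and join back to the table. The following is a rough sketch of that idea, run against an in-memory SQLite database from Python so it can be tried without a server. The sample rows are made up, and SQLite's two-argument min()/max() stand in for MySQL's least()/greatest(); in MySQL the identifiers would use backticks instead of double quotes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE pms (id INTEGER PRIMARY KEY, "from" INTEGER, "to" INTEGER, message TEXT)'
)
# Hypothetical sample data: user 1 talks with users 2 and 3; row 5 does not involve user 1.
conn.executemany(
    "INSERT INTO pms VALUES (?, ?, ?, ?)",
    [
        (1, 1, 2, "hi 2"),
        (2, 2, 1, "hi 1"),
        (3, 1, 3, "hi 3"),
        (4, 3, 1, "hello back"),
        (5, 2, 3, "not user 1"),
        (6, 2, 1, "latest with 2"),
    ],
)

# Inner query: one row per conversation (pair in either order), keeping its largest id.
# Outer query: join back to fetch the full latest message of each conversation.
LATEST = """
SELECT p.id, p."from", p."to", p.message
FROM pms AS p
JOIN (
    SELECT max(id) AS last_id
    FROM pms
    WHERE "from" = :uid OR "to" = :uid
    GROUP BY min("from", "to"), max("from", "to")
) AS c ON c.last_id = p.id
ORDER BY p.id DESC
LIMIT 10
"""
latest = conn.execute(LATEST, {"uid": 1}).fetchall()
for row in latest:
    print(row)
```

This still relies on the same assumption as the answer: a larger id means a later message.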

Related

Calculate average of values between 2 columns sql

I have a table called validation_errors that looks like this:
+-------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| link | varchar(200) | NO | MUL | NULL | |
| message | varchar(500) | NO | | | |
| explanation | mediumtext | NO | | NULL | |
| type | varchar(50) | NO | | | |
| subtype | varchar(50) | NO | | | |
| message_id | varchar(50) | NO | | | |
+-------------+--------------+------+-----+---------+----------------+
The links table looks like this:
+-----------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+--------------+------+-----+---------+-------+
| link | varchar(200) | NO | PRI | NULL | |
| visited | tinyint(1) | NO | | 0 | |
| validated | tinyint(1) | NO | | 0 | |
+-----------+--------------+------+-----+---------+-------+
I wish to calculate the average number of validation errors per page for each top-level domain.
I have a query that fetches the number of pages per top-level domain:
SELECT substr(link, - instr(reverse(link), '.')) as domain , count(*) as count
FROM links
GROUP BY domain
ORDER BY count desc
limit 30;
And I have a query that fetches the number of validation errors per top-level domain:
SELECT substr(link, - instr(reverse(link), '.')) as domain ,count(*) as count
FROM validation_errors
GROUP BY domain
ORDER BY count desc
limit 30;
What I now need to do is combine them into one query and divide one count by the other, and I can't figure out how to do it.
Any help would be greatly appreciated.
First, use substring_index(), rather than your construct. Here is the query to join them together:
select domain, sum(numviews) as numviews, sum(numerrors) as numerrors,
sum(numerrors) / nullif(sum(numviews), 0) as error_rate
from ((SELECT substring_index(link, '.', -1) as domain , count(*) as numviews, 0 as numerrors
FROM links
GROUP BY domain
) UNION ALL
(SELECT substring_index(link, '.', -1) as domain , 0, count(*)
FROM validation_errors
GROUP BY domain
)
) d
GROUP BY domain;
With two counts, I don't know which 30 you want to keep, so I haven't included an ORDER BY or LIMIT.
Note that this doesn't use a join, it uses union all with aggregation. This ensures that you will get all domains, even those with no views and those with no errors.
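To see the UNION ALL plus aggregation pattern in action, here is a small sketch that runs a version of the query on an in-memory SQLite database from Python. Several things here are assumptions and not part of the answer: the sample hostnames are invented, substring_index() is a hypothetical Python shim registered with SQLite (MySQL has it built in), the union branches are not parenthesized because SQLite rejects that, the output aliases are renamed, and the 1.0 * factor is only there because SQLite would otherwise do integer division.

```python
import sqlite3

def substring_index(s, delim, count):
    # Stand-in for MySQL's SUBSTRING_INDEX: the part before (count > 0)
    # or after (count < 0) the count-th occurrence of delim.
    if count == 0:
        return ""
    parts = s.split(delim)
    return delim.join(parts[:count] if count > 0 else parts[count:])

conn = sqlite3.connect(":memory:")
conn.create_function("substring_index", 3, substring_index)
conn.execute("CREATE TABLE links (link TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE validation_errors (id INTEGER PRIMARY KEY, link TEXT)")
# Hypothetical data: two .com pages, one .org page, and errors on .com and .net links.
conn.executemany("INSERT INTO links VALUES (?)", [("a.com",), ("b.com",), ("c.org",)])
conn.executemany(
    "INSERT INTO validation_errors (link) VALUES (?)",
    [("a.com",), ("a.com",), ("a.com",), ("b.com",), ("d.net",)],
)

QUERY = """
SELECT domain,
       sum(numviews)  AS total_pages,
       sum(numerrors) AS total_errors,
       1.0 * sum(numerrors) / nullif(sum(numviews), 0) AS error_rate
FROM (
    SELECT substring_index(link, '.', -1) AS domain, count(*) AS numviews, 0 AS numerrors
    FROM links
    GROUP BY domain
    UNION ALL
    SELECT substring_index(link, '.', -1) AS domain, 0, count(*)
    FROM validation_errors
    GROUP BY domain
) AS d
GROUP BY domain
ORDER BY domain
"""
rates = conn.execute(QUERY).fetchall()
for row in rates:
    print(row)
```

The .net row shows why nullif() is there: a domain with errors but no pages gets a NULL rate instead of a division by zero.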

LIMIT showing duplicate results

I can't figure out why this is happening. I have a table with the following columns:
+-------------+------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------+------+-----+---------+----------------+
| adid | int(11) | NO | PRI | NULL | auto_increment |
| price | float | YES | | NULL | |
| categoryid | int(11) | YES | | NULL | |
| visible | tinyint(4) | YES | MUL | NULL | |
+-------------+------------+------+-----+---------+----------------+
There are 7 records in this table that are visible and have category set as 3. I do a simple query like this:
SELECT adid FROM ads as a
WHERE categoryid = 3
and visible = 1
order by price desc
limit 0, 5
I get the following adids returned: 1,4,3,15,7
On the next page the query is:
SELECT adid FROM ads as a
WHERE categoryid = 3
and visible = 1
order by price desc
limit 5, 5
I get: 11,15
Maybe I am up too late, but why do I get 15 twice?
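For the results to be stable and consistent across pages, the sort needs a unique column as a tie-breaker. When several rows share the same price, MySQL is free to return them in any order, and that order can differ from one query to the next.
In this case it might be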
ORDER BY price DESC, adid
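As a quick check of the tie-breaker idea, here is a sketch using an in-memory SQLite database from Python. The prices are invented (the question does not list them), with three ads tied at the same price so the extra sort column matters. With adid in the ORDER BY, the two pages cannot overlap.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ads (adid INTEGER PRIMARY KEY, price REAL, categoryid INTEGER, visible INTEGER)"
)
# Hypothetical rows: ads 7, 11 and 15 share the same price.
conn.executemany(
    "INSERT INTO ads VALUES (?, ?, ?, ?)",
    [
        (1, 50.0, 3, 1),
        (3, 20.0, 3, 1),
        (4, 30.0, 3, 1),
        (7, 10.0, 3, 1),
        (11, 10.0, 3, 1),
        (15, 10.0, 3, 1),
        (20, 5.0, 3, 1),
    ],
)

PAGE = """
SELECT adid FROM ads
WHERE categoryid = 3 AND visible = 1
ORDER BY price DESC, adid   -- adid is unique, so the full order is deterministic
LIMIT ? OFFSET ?
"""

def fetch_page(offset, size=5):
    return [adid for (adid,) in conn.execute(PAGE, (size, offset))]

page1 = fetch_page(0)
page2 = fetch_page(5)
print(page1, page2)
```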

mysql average latest 5 rows

I have a table:
describe tests;
+-----------+-----------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-----------+------+-----+-------------------+-----------------------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| line_id | int(11) | NO | | NULL | |
| test_time | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| alarm_id | int(11) | YES | | NULL | |
| result | int(11) | NO | | NULL | |
+-----------+-----------+------+-----+-------------------+-----------------------------+
And I execute this query:
SELECT avg(result) FROM tests WHERE line_id = 4 ORDER BY test_time LIMIT 5;
which I want to return the average of the 5 latest results.
Something is still wrong, though, because the query returns the average over all of the table data.
What could be wrong?
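AVG() collapses all matching rows into a single row before ORDER BY and LIMIT are applied, so the LIMIT 5 has nothing left to cut. Select the rows in a subquery first. And if you want the last five rows, you need to order by the time column in descending order: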
select avg(result)
from (select result
      from tests
      where line_id = 4
      order by test_time desc
      limit 5
     ) t;
The previous answer posted something like this.
It works for me:
select avg( id ) from ( select id from rand limit 5) as id;
Only one row is returned because of the AVG function.
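To compare the two forms side by side, here is a sketch on an in-memory SQLite database from Python. The rows are made up (eight tests on line 4 with results 10 to 80, plus one row on another line), and test_time is stored as text here, which sorts correctly for this fixed format. The behaviour shown is SQLite's; it matches the MySQL behaviour described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tests (id INTEGER PRIMARY KEY, line_id INTEGER, test_time TEXT, result INTEGER)"
)
# Hypothetical data: results 10, 20, ..., 80 on line 4, one per day.
rows = [(i, 4, f"2024-01-{i:02d} 00:00:00", i * 10) for i in range(1, 9)]
rows.append((9, 5, "2024-01-09 00:00:00", 1000))  # a different line, must be ignored
conn.executemany("INSERT INTO tests VALUES (?, ?, ?, ?)", rows)

# Original form: the aggregate yields one row, so LIMIT 5 changes nothing.
naive = conn.execute(
    "SELECT avg(result) FROM tests WHERE line_id = 4 ORDER BY test_time LIMIT 5"
).fetchone()[0]

# Subquery form: pick the five newest rows first, then average them.
latest5 = conn.execute(
    """
    SELECT avg(result)
    FROM (SELECT result
          FROM tests
          WHERE line_id = 4
          ORDER BY test_time DESC
          LIMIT 5) AS t
    """
).fetchone()[0]

print(naive, latest5)
```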

MySQL: Improving query performance on join with ORDER BY clause

I have two tables which contain the daily activities of a user. I have to join these tables and select the top ten ids from the result.
Table 1 : buildlog
+----------------+------------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+------------------------+------+-----+---------+----------------+
| NAME | varchar(50) | YES | | NULL | |
| ID | int(11) | NO | PRI | NULL | auto_increment |
| DATE_AND_TIME | datetime | YES | | NULL | |
| COMMENT | mediumtext | YES | | NULL | |
+----------------+------------------------+------+-----+---------+----------------+
Number Of Rows : 276186
Table 2 : reports
+---------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+----------------+
| r_id | int(10) | NO | PRI | NULL | auto_increment |
| id | int(15) | YES | UNI | NULL | |
| label | varchar(200) | YES | | NULL | |
+---------------+--------------+------+-----+---------+----------------+
Number Of Rows : 134058
If I use only a join between these two tables on id, the result comes back very quickly.
Query 1:
select buildlog.id,reports.label from buildlog join reports on reports.id = buildlog.id limit 10\G
Query Time : 10 rows in set (0.01 sec)
If I add ORDER BY to get the latest ten build ids and labels, it takes 1 to 2 minutes to execute.
Query 2 :
select buildlog.id,reports.label from buildlog join reports on reports.id = buildlog.id order by buildlog.id desc limit 10\G
Query Time : 10 rows in set (0.98 sec)
The ORDER BY column is the primary key buildlog.id, so it is already indexed. Why does this query take so much longer to execute? Can anyone suggest how I can optimize it?
SELECT * FROM (
SELECT
buildlog.id,
reports.label
FROM
buildlog
JOIN
reports
ON
reports.id = buildlog.id
) AS myval_new
ORDER BY id DESC limit 10
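The slowdown probably comes from MySQL choosing to do the ordering before the join. Putting the ORDER BY in an outer query is meant to make it sort only the joined rows. Check the plan with EXPLAIN to confirm, since the optimizer may end up treating both forms the same way.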

How to code this SELECT statement?

Given this table :
mysql> describe activity;
+---------------------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------------------+-------------+------+-----+---------+-------+
| user_id | varchar(16) | NO | | NULL | |
| login_time | int(11) | NO | | NULL | |
| last_activity_time | int(11) | NO | | NULL | |
| last_activity_description | text | YES | | NULL | |
| logout_time | int(11) | NO | | NULL | |
+---------------------------+-------------+------+-----+---------+-------+
5 rows in set (0.01 sec)
I want to select the most recent last_activity_time (standard Unix timestamp) for each user who is logged in (i.e. has one or more rows where logout_time is not zero).
I tried
SELECT user_id, login_time, MAX(last_activity_time)
FROM activity
WHERE logout_time="0";
...but that found only a single entry even with two users logged in, probably because I am selecting MAX(last_activity_time).
What I want is something like
SELECT all unique user_ids
SELECT each of those which has one or more entries where `logout_time` != 0
SELECT the maximum value of `logout_time` for each of those
all in one single SELECT statement. How can I do that?
SELECT user_id, MAX(logout_time)
FROM activity
WHERE logout_time <> 0
GROUP BY user_id;
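The key part is the GROUP BY user_id: without it, MAX() collapses the whole table into one row, which is what the first attempt ran into. Here is a sketch on an in-memory SQLite database from Python, with invented rows. It runs the query above, and also a variant that is only my guess at the other reading of the question (the latest last_activity_time for users who have a row with logout_time = 0).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE activity (
           user_id TEXT, login_time INTEGER, last_activity_time INTEGER,
           last_activity_description TEXT, logout_time INTEGER)"""
)
# Hypothetical sessions; logout_time = 0 marks a session that has not ended.
conn.executemany(
    "INSERT INTO activity VALUES (?, ?, ?, NULL, ?)",
    [
        ("alice", 10, 50, 60),
        ("alice", 100, 150, 0),
        ("bob", 20, 30, 40),
        ("bob", 45, 70, 80),
        ("carol", 90, 95, 0),
    ],
)

# The answer's query: latest non-zero logout_time, one row per user.
last_logout = conn.execute(
    """SELECT user_id, MAX(logout_time)
       FROM activity
       WHERE logout_time <> 0
       GROUP BY user_id
       ORDER BY user_id"""
).fetchall()

# Assumed variant: latest activity of users with an open session.
last_active = conn.execute(
    """SELECT user_id, MAX(last_activity_time)
       FROM activity
       WHERE logout_time = 0
       GROUP BY user_id
       ORDER BY user_id"""
).fetchall()

print(last_logout, last_active)
```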