Example Data:
╔════╦════════════╦════════════╦═══════╦═══════════╦════════╗
║ ID ║ START ║ STOP ║ USER ║ FILE ║ SIZE ║
╠════╬════════════╬════════════╬═══════╬═══════════╬════════╣
║ 1 ║ 1330133409 ║ 1330133410 ║ user1 ║ file1.zip ║ 300000 ║
║ 2 ║ 1330133409 ║ 1330133410 ║ user1 ║ file2.zip ║ 300500 ║
║ 3 ║ 1330133409 ║ 1330133410 ║ user2 ║ file1.zip ║ 300000 ║
║ 4 ║ 1330133409 ║ 1330133410 ║ user2 ║ file2.zip ║ 300500 ║
║ 5 ║ 1330133409 ║ 1330133410 ║ user1 ║ file3.zip ║ 500000 ║
║ 6 ║ 1330133409 ║ 1330133310 ║ user6 ║ file3.zip ║ 500000 ║
╚════╩════════════╩════════════╩═══════╩═══════════╩════════╝
I need to create a MySQL query that computes PER_USER_AVERAGE_BANDWIDTH where PER_USER_AVERAGE_BANDWIDTH = SUM(SIZE) / (STOP - START), and then order by PER_USER_AVERAGE_BANDWIDTH to produce results like this:
╔═══════╦════════════════════════════╗
║ USER ║ PER_USER_AVERAGE_BANDWIDTH ║
╠═══════╬════════════════════════════╣
║ user3 ║ 110.37 ║
║ user1 ║ 100.25 ║
║ user2 ║ 75.70 ║
╚═══════╩════════════════════════════╝
Clear as mud ;) Anyone?
I think your average should be total size over total duration, grouped by user:
SELECT USER,
SUM(SIZE) / SUM(STOP - START) AS PER_USER_AVERAGE_BANDWIDTH
FROM my_table
GROUP BY USER
ORDER BY PER_USER_AVERAGE_BANDWIDTH DESC
See it on sqlfiddle.
Straightforward, if you want the average of the per-download rates:
SELECT
`user`,
AVG( size / ( stop - start ) ) per_user_average_bandwidth
FROM
tab_dl
GROUP BY `user`
ORDER BY per_user_average_bandwidth DESC
SQL Fiddle DEMO
This query should do it:
SELECT USER, (SUM(SIZE) / (STOP - START)) AS PER_USER_AVERAGE_BANDWIDTH
FROM my_table
GROUP BY USER, stop, start
ORDER BY PER_USER_AVERAGE_BANDWIDTH DESC
This will give you the average bandwidth per user per unique time frame (i.e. you will get 2 rows for a user if they download file 1 and file 2 between time 1 and time 5 and file 3 between time 1 and time 10).
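The queries above are not interchangeable, which is easy to miss because nearly every sample row spans exactly one second. Here is a minimal sketch of the difference between the first two, using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, made only so the example is self-contained). The two rows are made up for illustration, and the `1.0 *` factor is there because SQLite, unlike MySQL, performs integer division on integer operands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (user TEXT, start INTEGER, stop INTEGER, size INTEGER)"
)
# Hypothetical rows: one user, two equal-sized downloads of different duration.
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [("user1", 0, 1, 100), ("user1", 0, 4, 100)],
)

# Total size over total duration, as in the first answer.
total_over_total = conn.execute(
    "SELECT 1.0 * SUM(size) / SUM(stop - start) FROM my_table GROUP BY user"
).fetchone()[0]

# Mean of the individual download rates, as in the second answer.
mean_of_rates = conn.execute(
    "SELECT AVG(1.0 * size / (stop - start)) FROM my_table GROUP BY user"
).fetchone()[0]

print(total_over_total, mean_of_rates)  # 40.0 62.5
```

The first figure weights each download by how long it took; the second treats every download equally, so a short fast download pulls the average up. Which one counts as "average bandwidth" depends on what you want to report.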
Related
I have the following data:
╔════╦═══════╦═══════╗
║ id ║ group ║ place ║
╠════╬═══════╬═══════╣
║ 1 ║ 1 ║ a ║
║ 2 ║ 1 ║ b ║
║ 3 ║ 1 ║ b ║
║ 4 ║ 1 ║ a ║
║ 5 ║ 1 ║ c ║
║ 6 ║ 2 ║ a ║
║ 7 ║ 2 ║ b ║
║ 8 ║ 2 ║ c ║
╚════╩═══════╩═══════╝
How can I get the path of each group in MySQL?
The expected result is:
╔═══════╦════════════╗
║ group ║ path ║
╠═══════╬════════════╣
║ 1 ║ a-b-a-c ║
║ 2 ║ a-b-c ║
╚═══════╩════════════╝
Assuming that the end goal is to sort by group and id, and then simplify each group's sequence so that consecutive repeated places are only shown once:
Start by determining, for each row, whether the place or the group has changed since the previous row. There's a good solution to this problem in this answer.
Then use GROUP_CONCAT to merge the places together into a path.
Be aware that GROUP_CONCAT has a user-configurable maximum length, which by default is 1,024 characters.
SELECT
`group`,
GROUP_CONCAT(place ORDER BY id SEPARATOR '-') path
FROM
(SELECT
COALESCE(@place != place OR @group != `group`, 1) changed,
id,
@group:=`group` `group`,
@place:=place place
FROM
place_table, (SELECT @place:=NULL, @group:=NULL) s
ORDER BY `group`, id) t
WHERE
changed = 1
GROUP BY `group`;
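To sanity-check what the query should return, the same steps (sort by group and id, drop consecutive repeats, join with '-') can be sketched outside the database. This is only an illustration in Python using itertools.groupby, not part of the MySQL solution; the function name `paths` is hypothetical.

```python
from itertools import groupby

# (id, group, place) rows copied from the question.
rows = [
    (1, 1, "a"), (2, 1, "b"), (3, 1, "b"), (4, 1, "a"), (5, 1, "c"),
    (6, 2, "a"), (7, 2, "b"), (8, 2, "c"),
]

def paths(rows):
    """Return {group: path}, collapsing consecutive repeated places."""
    ordered = sorted(rows, key=lambda r: (r[1], r[0]))  # by group, then id
    result = {}
    for grp, members in groupby(ordered, key=lambda r: r[1]):
        # groupby on the place values yields one key per run of equal places.
        places = [place for place, _ in groupby(m[2] for m in members)]
        result[grp] = "-".join(places)
    return result

print(paths(rows))  # {1: 'a-b-a-c', 2: 'a-b-c'}
```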
Here is the MySQL table data:
╔════╦══════╦══════════╦══════════════╗
║ ID ║ USER ║ DATE ║ NUMDOWNLOADS ║
╠════╬══════╬══════════╬══════════════╣
║ 1 ║ John ║ xx-xx-xx ║ 1 ║
║ 2 ║ Mary ║ xx-xx-xx ║ 3 ║
║ 3 ║ John ║ xx-xx-xx ║ 5 ║
║ 4 ║ Mary ║ xx-xx-xx ║ 2 ║
║ 5 ║ Mary ║ xx-xx-xx ║ 6 ║
║ 6 ║ John ║ xx-xx-xx ║ 7 ║
║ 7 ║ John ║ xx-xx-xx ║ 1 ║
║ 8 ║ Mary ║ xx-xx-xx ║ 8 ║
║ 9 ║ Mary ║ xx-xx-xx ║ 9 ║
╚════╩══════╩══════════╩══════════════╝
What I want to accomplish is to group the data by USER, and display the total NUMDOWNLOADS per USER where NUMDOWNLOADS is > X. For example, if X=5:
John: 1 (since 1 NUMDOWNLOADS > 5, and others count collectively as 1)
Mary: 3 (since 3 NUMDOWNLOADS > 5, and others count collectively as 1)
So, (1) output per user, and (2) output total, which in this case would be 4. Clear as mud :) Ideas on statement to use?
Here is your query. Try it:
SELECT USER, COUNT(NUMDOWNLOADS)
FROM table_name
WHERE NUMDOWNLOADS > 5
GROUP BY USER
SELECT USER, COUNT(NUMDOWNLOADS)
FROM downloads
WHERE NUMDOWNLOADS > 5
GROUP BY USER
Follow the link below for a running demo:
SQLFiddle
I think you just want to count records where NUMDOWNLOADS > 5:
select USER, count(*)
from myTable
where NUMDOWNLOADS > 5
group by USER
The WHERE filter is performed before any grouping is done, so first this query filters out any rows that do not match NUMDOWNLOADS > 5, then it groups by USER and counts.
Alternatively if there is something about your actual query that requires you to use a conditional sum, you can do so as well:
select USER, sum(case when NUMDOWNLOADS > 5 then 1 else 0 end)
from myTable
group by USER
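As a quick check of both forms against the sample data, here is a small sketch that runs them through Python's built-in sqlite3 module (a stand-in for MySQL, chosen only so the example is self-contained). The grand total is computed in Python here; that part is my addition, not something either query returns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (user TEXT, numdownloads INTEGER)")
# USER and NUMDOWNLOADS columns from the question's table.
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [("John", 1), ("Mary", 3), ("John", 5), ("Mary", 2), ("Mary", 6),
     ("John", 7), ("John", 1), ("Mary", 8), ("Mary", 9)],
)

# Filter first, then group and count.
filtered = conn.execute(
    "SELECT user, COUNT(*) FROM myTable WHERE numdownloads > 5 "
    "GROUP BY user ORDER BY user"
).fetchall()

# Group everything, count only the matching rows.
conditional = conn.execute(
    "SELECT user, SUM(CASE WHEN numdownloads > 5 THEN 1 ELSE 0 END) "
    "FROM myTable GROUP BY user ORDER BY user"
).fetchall()

total = sum(count for _, count in filtered)
print(filtered, conditional, total)
```

One practical difference: a user with no rows above the threshold disappears from the WHERE version but appears with a count of 0 in the conditional-sum version.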
I've inherited a database that includes a lookup table to find other patents that are related to a given patent.
So it looks like
╔════╦═══════════╦════════════╗
║ id ║ patent_id ║ related_id ║
╠════╬═══════════╬════════════╣
║ 1 ║ 1 ║ 2 ║
║ 2 ║ 1 ║ 3 ║
║ 3 ║ 2 ║ 1 ║
║ 4 ║ 2 ║ 3 ║
║ 5 ║ 3 ║ 2 ║
╚════╩═══════════╩════════════╝
And I want to filter out the reciprocal relationships. 1->2 and 2->1 are the same for my purposes so I only want 1->2.
I don't need to make the edit in the table; I just need a query that returns a list of the unique relationships. While I'm sure it's simple, I've been banging my head against the keyboard for far too long.
Here is a clever query which you can try using. The general strategy is to identify the unwanted duplicate records and then subtract them away from the entire set.
SELECT t.id, t.patent_id, t.related_id
FROM t LEFT JOIN
(
SELECT t1.patent_id AS t1_patent_id, t1.related_id AS t1_related_id
FROM t t1 LEFT JOIN t t2
ON t1.related_id = t2.patent_id
WHERE t1.patent_id = t2.related_id AND t1.patent_id > t1.related_id
) t3
ON t.patent_id = t3.t1_patent_id AND t.related_id = t3.t1_related_id
WHERE t3.t1_patent_id IS NULL
Here is the inner temporary table generated by this query. You can convince yourself that by applying the logic in the WHERE clause you will select the correct records. Non-duplicate records are characterized by t1.patent_id != t2.related_id, and all these records are retained. In the case of duplicates (t1.patent_id = t2.related_id), the record chosen from each pair of duplicates is the one where patent_id < related_id, as you requested in your question.
╔════╦══════════════╦═══════════════╦══════════════╦═══════════════╗
║ id ║ t1.patent_id ║ t1.related_id ║ t2.patent_id ║ t2.related_id ║
╠════╬══════════════╬═══════════════╬══════════════╬═══════════════╣
║ 1 ║ 1 ║ 2 ║ 2 ║ 1 ║ * duplicate
║ 1 ║ 1 ║ 2 ║ 2 ║ 3 ║
║ 2 ║ 1 ║ 3 ║ 3 ║ 2 ║
║ 3 ║ 2 ║ 1 ║ 1 ║ 2 ║ * duplicate
║ 3 ║ 2 ║ 1 ║ 1 ║ 3 ║
║ 4 ║ 2 ║ 3 ║ 3 ║ 2 ║ * duplicate
║ 5 ║ 3 ║ 2 ║ 2 ║ 1 ║
║ 5 ║ 3 ║ 2 ║ 2 ║ 3 ║ * duplicate
╚════╩══════════════╩═══════════════╩══════════════╩═══════════════╝
Click the link below for a running example of this query.
SQLFiddle
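If you want to convince yourself without the fiddle, here is a minimal sketch that runs the same query on the sample rows through Python's built-in sqlite3 module (a stand-in for MySQL; the `ORDER BY t.id` is added only to make the output order predictable). As an independent cross-check it also normalises every pair to (smaller, larger) in Python, which is a different technique from the query and is used here purely for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, patent_id INTEGER, related_id INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 1, 2), (2, 1, 3), (3, 2, 1), (4, 2, 3), (5, 3, 2)],
)

query = """
SELECT t.id, t.patent_id, t.related_id
FROM t LEFT JOIN
(
    SELECT t1.patent_id AS t1_patent_id, t1.related_id AS t1_related_id
    FROM t t1 LEFT JOIN t t2
        ON t1.related_id = t2.patent_id
    WHERE t1.patent_id = t2.related_id AND t1.patent_id > t1.related_id
) t3
    ON t.patent_id = t3.t1_patent_id AND t.related_id = t3.t1_related_id
WHERE t3.t1_patent_id IS NULL
ORDER BY t.id
"""
kept = conn.execute(query).fetchall()

# Cross-check: the set of unordered pairs present in the table.
unique_pairs = {
    tuple(sorted((p, r)))
    for p, r in conn.execute("SELECT patent_id, related_id FROM t")
}

print(kept)  # [(1, 1, 2), (2, 1, 3), (4, 2, 3)]
```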
Try something like
select distinct * from
(select patent_id, related_id from TABLENAME
union
select related_id, patent_id from TABLENAME
) u;
Okay, you're right, the above won't work. Try
select patent_id, related_id from TABLENAME p1
where p1.patent_id < p1.related_id
or p1.related_id not in
(select p2.patent_id from TABLENAME p2
where p2.related_id = p1.patent_id)
I need to clean some data by merging two similar but slightly different dimension field values into one new row that adds together the two metric values, keeping the uid and date intact.
Current setup looks like this:
╔═════╦═════════════╦══════╦═══════════╦═══════════╗
║ id ║ date ║ uid ║ source ║ pageviews ║
╠═════╬═════════════╬══════╬═══════════╬═══════════╣
║ 1 ║ 2013-12-11 ║ 111 ║ source1 ║ 14 ║
║ 3 ║ 2013-12-11 ║ 111 ║ source1a ║ 1 ║
║ 11 ║ 2013-12-11 ║ 222 ║ source1 ║ 3 ║
║ 19 ║ 2013-12-11 ║ 222 ║ source1a ║ 11 ║
╚═════╩═════════════╩══════╩═══════════╩═══════════╝
I'd like to consider source1 and source1a to be equal and merge the two, to get this:
╔═════╦═════════════╦══════╦══════════╦═══════════╗
║ id ║ date ║ uid ║ source ║ pageviews ║
╠═════╬═════════════╬══════╬══════════╬═══════════╣
║ 1 ║ 2013-12-11 ║ 111 ║ source1 ║ 15 ║
║ 2 ║ 2013-12-11 ║ 222 ║ source1 ║ 14 ║
╚═════╩═════════════╩══════╩══════════╩═══════════╝
The id is not important; I had planned to re-increment the id in the resulting new table.
This is what I tried, but it didn't merge the two records – I am getting matching values but still separate rows:
SELECT date, uid, (SELECT CASE
WHEN source = 'source1a' THEN 'source1'
ELSE source
END) AS 'source', pageviews
FROM trafficSourceMedium
GROUP BY date, source, userid
An aggregation query should do what you want:
select `date`, uid,
(case when source = 'source1a' then 'source1' else source end) as source,
sum(pageviews) as pageviews
from trafficSourceMedium
group by `date`, uid,
(case when source = 'source1a' then 'source1' else source end);
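To see the effect on the sample rows, here is a small sketch that runs this query through Python's built-in sqlite3 module (a stand-in for MySQL, used only to keep the example self-contained; the `ORDER BY uid` is added for a predictable output order). The key points are the SUM over pageviews and grouping by the mapped value rather than the raw source column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trafficSourceMedium "
    "(id INTEGER, date TEXT, uid INTEGER, source TEXT, pageviews INTEGER)"
)
conn.executemany(
    "INSERT INTO trafficSourceMedium VALUES (?, ?, ?, ?, ?)",
    [(1, "2013-12-11", 111, "source1", 14),
     (3, "2013-12-11", 111, "source1a", 1),
     (11, "2013-12-11", 222, "source1", 3),
     (19, "2013-12-11", 222, "source1a", 11)],
)

merged = conn.execute("""
    SELECT date, uid,
           (CASE WHEN source = 'source1a' THEN 'source1' ELSE source END) AS source,
           SUM(pageviews) AS pageviews
    FROM trafficSourceMedium
    GROUP BY date, uid,
             (CASE WHEN source = 'source1a' THEN 'source1' ELSE source END)
    ORDER BY uid
""").fetchall()

print(merged)
# [('2013-12-11', 111, 'source1', 15), ('2013-12-11', 222, 'source1', 14)]
```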
╔════════╦═══════════╦═══════╗
║ MSG_ID ║ RANDOM_ID ║ MSG ║
╠════════╬═══════════╬═══════╣
║ 1 ║ 22 ║ apple ║
║ 2 ║ 22 ║ bag ║
║ 3 ║ 0 ║ cat ║
║ 4 ║ 0 ║ dog ║
║ 5 ║ 0 ║ egg ║
║ 6 ║ 21 ║ fish ║
║ 7 ║ 21 ║ hen ║
║ 8 ║ 20 ║ glass ║
╚════════╩═══════════╩═══════╝
I want to fetch 3 records per batch in such a way that all the rows for a particular random_id are picked up together.
Result Required:
╔════════╦═══════════╦═══════╗
║ MSG_ID ║ RANDOM_ID ║ MSG ║
╠════════╬═══════════╬═══════╣
║ 1 ║ 22 ║ apple ║
║ 2 ║ 22 ║ bag ║
║ 3 ║ 0 ║ cat ║
╚════════╩═══════════╩═══════╝
Current Result:
╔════════╦═══════════╦═══════╗
║ MSG_ID ║ RANDOM_ID ║ MSG ║
╠════════╬═══════════╬═══════╣
║ 1 ║ 22 ║ apple ║
║ 3 ║ 0 ║ cat ║
║ 4 ║ 0 ║ dog ║
╚════════╩═══════════╩═══════╝
Query Used:
SELECT ID,Random_ID, GROUP_CONCAT(message SEPARATOR ' ' ),FLAG,mobile,sender_number,SMStype
FROM messagemaster
WHERE Random_ID > 0
GROUP BY Random_ID
UNION
SELECT ID,Random_ID, message,FLAG,mobile,sender_number,SMStype
FROM messagemaster
WHERE Random_ID = 0
order by random_id LIMIT 100;
I don't want to pick up records using GROUP BY. I want to fetch all the records with respect to their random_ids. For instance, if there is a random_id with 3 records and the query has LIMIT 3, then I want all the rows for that random_id to be picked up.
The situation is that if I fetch rows with LIMIT 100, I don't want any random_id that appears in the result set to have some of its rows left out.
For example, if I am picking records with LIMIT 3, then for random_id = 22, all records with random_id = 22 should be picked.
Consider the following...
SELECT b.*
FROM
( SELECT x.*, SUM(y.cnt)
FROM
( SELECT random_id,COUNT(*) cnt FROM messagemaster GROUP BY random_id) x
JOIN
( SELECT random_id,COUNT(*) cnt FROM messagemaster GROUP BY random_id) y
ON y.random_id >= x.random_id
GROUP
BY x.random_id
HAVING SUM(y.cnt) < 4
) a
JOIN messagemaster b
ON b.random_id = a.random_id;
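To see what this returns for the sample table, here is a minimal sketch that runs essentially the same query through Python's built-in sqlite3 module (a stand-in for MySQL). The column names msg_id and msg follow the table in the question, and the `running_total` alias and `ORDER BY b.msg_id` are my additions for readability and a predictable order. The running total is accumulated from the highest random_id downwards, and `< 4` keeps whole groups while the total stays at three rows or fewer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messagemaster (msg_id INTEGER, random_id INTEGER, msg TEXT)"
)
conn.executemany(
    "INSERT INTO messagemaster VALUES (?, ?, ?)",
    [(1, 22, "apple"), (2, 22, "bag"), (3, 0, "cat"), (4, 0, "dog"),
     (5, 0, "egg"), (6, 21, "fish"), (7, 21, "hen"), (8, 20, "glass")],
)

query = """
SELECT b.*
FROM
  ( SELECT x.random_id, SUM(y.cnt) AS running_total
    FROM
      ( SELECT random_id, COUNT(*) cnt FROM messagemaster GROUP BY random_id ) x
    JOIN
      ( SELECT random_id, COUNT(*) cnt FROM messagemaster GROUP BY random_id ) y
      ON y.random_id >= x.random_id
    GROUP BY x.random_id
    HAVING SUM(y.cnt) < 4
  ) a
JOIN messagemaster b
  ON b.random_id = a.random_id
ORDER BY b.msg_id
"""
batch = conn.execute(query).fetchall()

print(batch)  # [(1, 22, 'apple'), (2, 22, 'bag')]
```

Only the two rows for random_id 22 come back, because the next group in that order (random_id 21) would bring the running total to four.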