Here is the table I use:
+-------------+----------+---------------------+
| sourceindex | source   | pa                  |
+-------------+----------+---------------------+
|           0 | this     | 0.13842974556609988 |
|           1 | is       | 0.26446279883384705 |
|           2 | a        | 0.26446279883384705 |
|           3 | book     | 0.13842974556609988 |
|           4 | ,        | 0.26446279883384705 |
|           5 | that     | 0.13842974556609988 |
+-------------+----------+---------------------+
I want to add a column that holds the result of log(sum(pa))/pa.
Any suggestions on how I could do that?
You can use a cross join to calculate log(sum(pa)) once, and then in the outer query divide that result by each row's pa value:
update
test t
join (select
`sourceindex`, `source`, `pa` , log_sum/pa new_col
from
test
cross join (select log(sum(pa)) log_sum
from test ) a
) t1
on (t.sourceindex= t1.sourceindex
and t.source = t1.source
and t.pa = t1.pa
)
set t.new_col = t1.new_col
But it's better to leave the table as it is and compute the value with a SELECT query:
select `sourceindex`, `source`, `pa` , log_sum/pa new_col
from
test
cross join (select log(sum(pa)) log_sum
from test ) t
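To see what the SELECT version computes, here is a minimal sketch in Python that runs the same cross join against an in-memory SQLite table instead of MySQL. The table name `test` and the rows come from the question; the SQL function name `ln` is an assumption of this sketch, registered from Python because SQLite builds do not always ship math functions (MySQL's one-argument LOG() is the natural logarithm).

```python
import math
import sqlite3

# Sample rows copied from the question's table.
rows = [
    (0, "this", 0.13842974556609988),
    (1, "is", 0.26446279883384705),
    (2, "a", 0.26446279883384705),
    (3, "book", 0.13842974556609988),
    (4, ",", 0.26446279883384705),
    (5, "that", 0.13842974556609988),
]

conn = sqlite3.connect(":memory:")
# Assumption: expose Python's natural log to SQL under the name "ln".
conn.create_function("ln", 1, math.log)
conn.execute("CREATE TABLE test (sourceindex INTEGER, source TEXT, pa REAL)")
conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)

# The one-row subquery is cross joined to every row of test,
# so log_sum is computed once and reused for each division.
result = conn.execute(
    """
    SELECT sourceindex, source, pa, log_sum / pa AS new_col
    FROM test
    CROSS JOIN (SELECT ln(SUM(pa)) AS log_sum FROM test) AS t
    ORDER BY sourceindex
    """
).fetchall()

for row in result:
    print(row)
```

Because the subquery returns exactly one row, the cross join does not multiply the rows of test; it only attaches the same log_sum value to each of them.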
I have two tables, tbl1 and tbl2.
tbl1 contains 5 columns: id (pk), email, address, pid (INDEX), status (ENUM Y, N)
tbl2 contains 3 columns: id (pk), pid (INDEX), domain
When I run this query
SELECT *
FROM tbl1 as l
LEFT JOIN tbl2 as m on l.pid=m.pid
WHERE l.status='Y';
It returns multiple records. Please note that the join is on pid, and pid is not a primary key in either table. Please help me get only unique rows from both tables.
You seem to want to join on the basis of relative position in the tables. One way to do this is to simulate row_number using variables.
drop table if exists t1,t2;
create table t1(id int, email varchar(5),address varchar(10),pid int,status varchar(1));
create table t2(id int, pid int, domain varchar(5));
insert into t1 values (1,'aa#aa', 'aaaaa',428,'Y'), (2,'bb#bb', 'bbbbb',428,'n'),(3,'cc#cc', 'ccccc',428,'Y') ;
insert into t2 values (1,428,'mmm'),(2,428,'zzz');
select t1.*,t2.*
from
(
select t1.*,
if(t1.pid <> @pid1, @bn1:=@bn1+1,@bn1:=@bn1) BlockNo1,
if(t1.id <> @id1, @rn1:=@rn1+1, @rn1:=1) rowno1,
@pid1:=t1.pid pid1,
@id1:=t1.id p1
from t1
cross join (select @bn1:=0,@rn1:=0, @pid1:=0 ,@id1:=0) r
where status = 'y'
order by t1.pid,t1.id
) t1
join
(
select t2.id t2id,t2.pid t2pid, t2.domain t2domain,
if(t2.pid <> @pid2, @bn2:=@bn2+1,@bn2:=@bn2) BlockNo2,
if(t2.id <> @id2, @rn2:=@rn2+1, @rn2:=1) rowno2,
@pid2:=t2.pid pid2,
@id2:=t2.id p2
from t2
cross join (select @bn2:=0,@rn2:=0, @pid2:=0 ,@id2:=0) r
order by t2.pid,t2.id
) t2 on (t1.blockno1 = t2.blockno2) and (t1.rowno1 = t2.rowno2)
+------+-------+---------+------+--------+----------+--------+------+------+------+-------+----------+----------+--------+------+------+
| id   | email | address | pid  | status | BlockNo1 | rowno1 | pid1 | p1   | t2id | t2pid | t2domain | BlockNo2 | rowno2 | pid2 | p2   |
+------+-------+---------+------+--------+----------+--------+------+------+------+-------+----------+----------+--------+------+------+
|    1 | aa#aa | aaaaa   |  428 | Y      |        1 |      1 |  428 |    1 |    1 |   428 | mmm      |        1 |      1 |  428 |    1 |
|    3 | cc#cc | ccccc   |  428 | Y      |        1 |      2 |  428 |    3 |    2 |   428 | zzz      |        1 |      2 |  428 |    2 |
+------+-------+---------+------+--------+----------+--------+------+------+------+-------+----------+----------+--------+------+------+
2 rows in set (0.04 sec)
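For context on why the original LEFT JOIN returned "multiple records", here is a small sketch in Python using an in-memory SQLite database as a stand-in for MySQL, loaded with the same sample rows as above. It only illustrates the row multiplication; it does not reproduce the variable-based pairing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1 (id INTEGER, email TEXT, address TEXT, pid INTEGER, status TEXT);
    CREATE TABLE t2 (id INTEGER, pid INTEGER, domain TEXT);
    INSERT INTO t1 VALUES (1, 'aa#aa', 'aaaaa', 428, 'Y'),
                          (2, 'bb#bb', 'bbbbb', 428, 'n'),
                          (3, 'cc#cc', 'ccccc', 428, 'Y');
    INSERT INTO t2 VALUES (1, 428, 'mmm'), (2, 428, 'zzz');
    """
)

# Joining on pid alone pairs every status = 'Y' row of t1
# with every t2 row that shares the same pid.
pairs = conn.execute(
    """
    SELECT l.id, m.id
    FROM t1 AS l
    LEFT JOIN t2 AS m ON l.pid = m.pid
    WHERE l.status = 'Y'
    ORDER BY l.id, m.id
    """
).fetchall()

print(pairs)  # 2 matching t1 rows x 2 t2 rows = 4 combinations
```

Since pid is not unique on either side, the join has nothing to tell it which t2 row belongs to which t1 row, which is why the answer above has to invent a position (block number and row number) to join on.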
I have this query:
SELECT MIN(id),CustomerName, Scenario,StepNo,InTransit,IsAlef,runNo,ResponseLength
FROM `RequestInfo`
WHERE `CustomerName` = 'Hotstar'
AND `ResponseContentType` like '%video/MP2T%'
AND `RequestHttpRequest` like '%segment%' ;
which gives me output like this:
+---------+--------------+----------+--------+-----------+--------+-------+----------------+----------+
| MIN(id) | CustomerName | Scenario | StepNo | InTransit | IsAlef | runNo | ResponseLength | IsActive |
+---------+--------------+----------+--------+-----------+--------+-------+----------------+----------+
|     139 | HotStar      | SearchTv |      1 | No        | No     |     1 |         410098 | NULL     |
+---------+--------------+----------+--------+-----------+--------+-------+----------------+----------+
I want to insert the string "Yes" into the last column, i.e. "IsActive", for the row displayed above, but only when IsActive is currently NULL.
Use the query below:
UPDATE RequestInfo R
INNER JOIN (SELECT MIN(id) AS id
FROM `RequestInfo`
WHERE `CustomerName` = 'Hotstar'
AND `ResponseContentType` LIKE '%video/MP2T%'
AND `RequestHttpRequest` LIKE '%segment%') AS T ON R.id = T.id
SET R.IsActive = 'Yes'
WHERE R.IsActive IS NULL;
I have a couple of very large tables (over 400,000 rows) that look like the following:
+---------+--------+---------------+
| ID      | M1     | M1_Percentile |
+---------+--------+---------------+
| 3684514 | 3.2997 |          NULL |
| 3684515 | 3.0476 |          NULL |
| 3684516 | 2.6499 |          NULL |
| 3684517 | 0.3585 |          NULL |
| 3684518 | 1.6919 |          NULL |
| 3684519 | 2.8515 |          NULL |
| 3684520 | 4.0728 |          NULL |
| 3684521 | 4.0224 |          NULL |
| 3684522 | 5.8207 |          NULL |
| 3684523 | 6.8291 |          NULL |
+---------+--------+---------------+
...about 400,000 more
I need to assign each row in the M1_Percentile column a value that represents "the percent of rows with M1 values less than or equal to the current row's M1 value".
In other words, I need:
M1_Percentile = (number of rows with M1 <= this row's M1) / (total number of rows) * 100
I implemented this successfully, but it is FAR FAR too slow. If anyone could create a more efficient version of the following code, I would really appreciate it!
UPDATE myTable AS X JOIN (
SELECT
s1.ID, COUNT(s2.ID)/ (SELECT COUNT(*) FROM myTable) * 100 AS percentile
FROM
myTable s1 JOIN myTable s2 on (s2.M1 <= s1.M1)
GROUP BY s1.ID
ORDER BY s1.ID) AS Z
ON (X.ID = Z.ID)
SET X.M1_Percentile = Z.percentile;
This is the (correct but slow) result from the above query if the number of rows is limited to the ones you see (10 rows):
+---------+--------+---------------+
| ID      | M1     | M1_Percentile |
+---------+--------+---------------+
| 3684514 | 3.2997 |            60 |
| 3684515 | 3.0476 |            50 |
| 3684516 | 2.6499 |            30 |
| 3684517 | 0.3585 |            10 |
| 3684518 | 1.6919 |            20 |
| 3684519 | 2.8515 |            40 |
| 3684520 | 4.0728 |            80 |
| 3684521 | 4.0224 |            70 |
| 3684522 | 5.8207 |            90 |
| 3684523 | 6.8291 |           100 |
+---------+--------+---------------+
Producing the same results for the entire 400,000 rows takes orders of magnitude longer.
I cannot test this, but you could try something like:
update myTable t
set M1_Percentile = (
(select count(*)
from myTable t1
where t1.M1 <= t.M1) * 100 / (
select count(*)
from myTable));
UPDATE:
update test t
set m1_pc = (
(select count(*) from test t1 where t1.M1 <= t.M1) * 100 /
( select count(*) from test));
This works in Oracle (the only database I have available). I do remember getting that error in MySQL. It is very annoying.
Fair warning: MySQL isn't my native environment. However, after a little research, I think the following query should be workable:
UPDATE myTable AS X
JOIN (
SELECT Y.ID, (
SELECT COUNT(*)
FROM myTable X1
WHERE (Y.M1, Y.ID) >= (X1.M1, X1.ID)) AS Rnk
FROM myTable AS Y
) AS RowRank
ON (X.ID = RowRank.ID)
CROSS JOIN (
SELECT COUNT(*) AS TotalCount
FROM myTable
) AS TotalCount
SET X.M1_Percentile = RowRank.Rnk / TotalCount.TotalCount * 100;
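For comparison, the definition in the question (percent of rows whose M1 is less than or equal to the current row's M1) can be computed with one sort plus a binary search per row, instead of comparing every row against every other row. The following is a minimal Python sketch of that idea on the ten sample rows; the function name `percentiles` is hypothetical and nothing in it is MySQL-specific.

```python
from bisect import bisect_right


def percentiles(rows):
    """Map each ID to the percent of rows whose M1 is <= that row's M1."""
    sorted_m1 = sorted(m1 for _, m1 in rows)
    total = len(sorted_m1)
    # bisect_right returns how many sorted values are <= m1.
    return {
        row_id: bisect_right(sorted_m1, m1) * 100 / total
        for row_id, m1 in rows
    }


# The ten (ID, M1) rows shown in the question.
sample = [
    (3684514, 3.2997),
    (3684515, 3.0476),
    (3684516, 2.6499),
    (3684517, 0.3585),
    (3684518, 1.6919),
    (3684519, 2.8515),
    (3684520, 4.0728),
    (3684521, 4.0224),
    (3684522, 5.8207),
    (3684523, 6.8291),
]

result = percentiles(sample)
for row_id, pct in result.items():
    print(row_id, pct)
```

The sort costs O(n log n), whereas the self-join compares on the order of n squared row pairs, which is why the original query degrades so badly at 400,000 rows. If you are on MySQL 8.0 or later, the CUME_DIST() window function expresses the same definition directly in SQL.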
I have a MySQL table called daily_measurements with the following structure:
+------------+----------+------+-----+---------------------+----------------+
| Field      | Type     | Null | Key | Default             | Extra          |
+------------+----------+------+-----+---------------------+----------------+
| id         | int(11)  | NO   | PRI | NULL                | auto_increment |
| user_id    | int(11)  | NO   |     | 0                   |                |
| date       | datetime | NO   | MUL | 0000-00-00 00:00:00 |                |
| weight     | float    | NO   |     | 0                   |                |
| bicep      | float    | NO   |     | 0                   |                |
| chest      | float    | NO   |     | 0                   |                |
| waist      | float    | NO   |     | 0                   |                |
| neck       | float    | NO   |     | 0                   |                |
| thigh      | float    | NO   |     | 0                   |                |
| hips       | float    | NO   |     | 0                   |                |
| shoulders  | float    | NO   |     | 0                   |                |
| knee       | float    | NO   |     | 0                   |                |
| ankle      | float    | NO   |     | 0                   |                |
| created_on | datetime | NO   |     | 0000-00-00 00:00:00 |                |
+------------+----------+------+-----+---------------------+----------------+
I need to retrieve a list of every user's weight for their first and last entry.
I've tried various combinations of GROUP BY, MIN(date), MAX(date), etc., but I can't seem to figure out a way to do it efficiently.
The only way I've been able to get this to work is the following query on the users table with 2 subqueries, but since there are approximately 30,000 users and more than 200,000 measurements, the query chokes pretty badly.
SELECT u.id,
(SELECT weight FROM daily_measurements WHERE user_id = u.id ORDER BY date ASC LIMIT 1) AS starting_weight,
(SELECT weight FROM daily_measurements WHERE user_id = u.id ORDER BY date DESC LIMIT 1) AS ending_weight
FROM users u
Any help would be appreciated.
My solution:
SELECT
u1.user_id,
u2.first_entry_weight,
u1.weight AS last_entry_weight
FROM daily_measurements u1
INNER JOIN (SELECT
u1.user_id,
u1.weight AS first_entry_weight,
u2.fe,
u2.le
FROM daily_measurements u1
INNER JOIN (SELECT
daily_measurements.user_id,
MIN(date) fe,
MAX(date) le
FROM daily_measurements
GROUP BY daily_measurements.user_id) u2
ON u1.user_id = u2.user_id
AND u1.date = u2.fe) u2
ON u1.user_id = u2.user_id
AND u1.date = u2.le
I cannot test this or its performance at the moment, but I think you can start from the following query:
SELECT
u.id,
SUBSTRING_INDEX( GROUP_CONCAT(CAST(d.weight AS CHAR) ORDER BY d.date ASC ), ',', 1 ) as starting_weight,
SUBSTRING_INDEX( GROUP_CONCAT(CAST(d.weight AS CHAR) ORDER BY d.date DESC), ',', 1 ) as ending_weight
FROM users as u
LEFT JOIN daily_measurements as d ON (u.id = d.user_id)
GROUP BY u.id
Edit: please treat this as a suggestion for your query.
With this many users, a JOIN could be hundreds of times faster than two SELECT subqueries.
SELECT A.user_id,
B.weight InitialWeight,
B.`date` InitialDate,
C.weight LatestWeight,
C.`date` LatestDate
FROM
(
SELECT user_id,MIN(id) idmin,MAX(id) idmax
FROM daily_measurements GROUP BY user_id
) A
INNER JOIN daily_measurements B ON (A.user_id=B.user_id AND A.idmin = B.id)
INNER JOIN daily_measurements C ON (A.user_id=C.user_id AND A.idmax = C.id);
Please make sure you have an index like this
ALTER TABLE daily_measurements ADD UNIQUE INDEX userid_id_ndx (user_id,id);
Try this:
select tb.* from daily_measurements tb
join (
select user_id, MIN(date) firstDate, MAX(date) lastDate
from daily_measurements
group by user_id
) temp
on tb.user_id = temp.user_id
and (tb.date = temp.firstDate or tb.date = temp.lastDate)
The subquery will identify first date and last date rows for each user_id, and main query will fetch the rows again to get all the data.
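The "find the MIN/MAX date per user, then join back" idea can be tried out quickly with a sketch like the following, which uses Python's sqlite3 module as a stand-in for MySQL. The five measurement rows are made up for illustration and only the columns needed here are created; this variant joins back twice so that each user comes out as a single row with both weights.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_measurements ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, date TEXT, weight REAL)"
)
# Hypothetical sample data, not taken from the question.
conn.executemany(
    "INSERT INTO daily_measurements (user_id, date, weight) VALUES (?, ?, ?)",
    [
        (1, "2012-01-01 08:00:00", 200.0),
        (1, "2012-02-01 08:00:00", 195.0),
        (1, "2012-03-01 08:00:00", 190.0),
        (2, "2012-01-15 08:00:00", 150.0),
        (2, "2012-02-15 08:00:00", 152.0),
    ],
)

# Step 1 (subquery r): first and last date per user.
# Step 2: join back once per date to pick up the matching weight.
first_last = conn.execute(
    """
    SELECT r.user_id, f.weight AS first_weight, l.weight AS last_weight
    FROM (
        SELECT user_id, MIN(date) AS first_date, MAX(date) AS last_date
        FROM daily_measurements
        GROUP BY user_id
    ) AS r
    JOIN daily_measurements AS f
      ON f.user_id = r.user_id AND f.date = r.first_date
    JOIN daily_measurements AS l
      ON l.user_id = r.user_id AND l.date = r.last_date
    ORDER BY r.user_id
    """
).fetchall()

print(first_last)
```

One caveat with any date-based join of this kind: if a user has two measurements with exactly the same first or last datetime, that user will appear more than once in the result.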
My table structure looks like this:
tbl.users                 tbl.issues
+--------+-----------+    +---------+------------+-----------+
| userid | real_name |    | issueid | assignedid | creatorid |
+--------+-----------+    +---------+------------+-----------+
|      1 | test_1    |    |       1 |          1 |         1 |
|      2 | test_2    |    |       2 |          1 |         2 |
+--------+-----------+    +---------+------------+-----------+
Basically I want to write a query that will end in a results table looking like this:
(results table)
+---------+------------+---------------+-----------+--------------+
| issueid | assignedid | assigned_name | creatorid | creator_name |
+---------+------------+---------------+-----------+--------------+
|       1 |          1 | test_1        |         1 | test_1       |
|       2 |          1 | test_1        |         2 | test_2       |
+---------+------------+---------------+-----------+--------------+
My SQL looks like this at the moment:
SELECT
`issues`.`issueid`,
`issues`.`creatorid`,
`issues`.`assignedid`,
`users`.`real_name`
FROM `issues`
JOIN `users`
ON ( `users`.`userid` = `issues`.`creatorid` )
OR (`users`.`userid` = `issues`.`assignedid`)
ORDER BY `issueid` ASC
LIMIT 0 , 30
This returns something like this:
(results table)
+---------+------------+-----------+-----------+
| issueid | assignedid | creatorid | real_name |
+---------+------------+-----------+-----------+
|       1 |          1 |         1 | test_1    |
|       2 |          1 |         2 | test_1    |
|       2 |          1 |         2 | test_2    |
+---------+------------+-----------+-----------+
Can anyone help me get to the desired results table?
SELECT
IssueID,
AssignedID,
CreatorID,
AssignedUser.real_name AS AssignedName,
CreatorUser.real_name AS CreatorName
FROM issues
LEFT JOIN users AS AssignedUser
ON issues.AssignedID = AssignedUser.UserID
LEFT JOIN users AS CreatorUser
ON issues.CreatorID = CreatorUser.UserID
ORDER BY `issueid` ASC
LIMIT 0, 30
On the general knowledge front, our illustrious site founder wrote a very nice blog article on this subject which I find myself referring to over and over again.
Visual Explanation of SQL Joins
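To check that joining the users table twice under two different aliases yields the desired results table, here is a minimal sketch that loads the two sample tables from the question into an in-memory SQLite database through Python's sqlite3 module (a stand-in for MySQL; the aliases `a` and `c` are arbitrary).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (userid INTEGER PRIMARY KEY, real_name TEXT);
    CREATE TABLE issues (issueid INTEGER PRIMARY KEY,
                         assignedid INTEGER, creatorid INTEGER);
    INSERT INTO users VALUES (1, 'test_1'), (2, 'test_2');
    INSERT INTO issues VALUES (1, 1, 1), (2, 1, 2);
    """
)

# users is joined once per role: alias a resolves the assignee,
# alias c resolves the creator, so each issue stays on one row.
rows = conn.execute(
    """
    SELECT i.issueid,
           i.assignedid, a.real_name AS assigned_name,
           i.creatorid,  c.real_name AS creator_name
    FROM issues AS i
    LEFT JOIN users AS a ON a.userid = i.assignedid
    LEFT JOIN users AS c ON c.userid = i.creatorid
    ORDER BY i.issueid
    """
).fetchall()

for row in rows:
    print(row)
```

The LEFT JOINs keep an issue in the output even when its assignee or creator has no matching user; with INNER JOINs such an issue would be dropped.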
Use this:
SELECT
`issues`.`issueid`,
`issues`.`creatorid`,
`creator`.`real_name` AS creator_name,
`issues`.`assignedid`,
`assigned`.`real_name` AS assigned_name
FROM `issues`
INNER JOIN `users` creator ON ( `creator`.`userid` = `issues`.`creatorid` )
INNER JOIN `users` assigned ON (`assigned`.`userid` = `issues`.`assignedid`)
ORDER BY `issueid` ASC
LIMIT 0 , 30
SELECT DISTINCT i.issueid, i.creatorid, i.assignedid, u.real_name
FROM issues i, users u
WHERE u.userid = i.creatorid OR u.userid = i.assignedid
ORDER BY i.issueid ASC
LIMIT 0 , 30
The parentheses around the DISTINCT column list are not needed (MySQL rejects a multi-column list wrapped in parentheses there), so they are left out.
Does this work?
SELECT
i.issueid,
i.assignedid,
u1.real_name as assigned_name,
i.creatorid,
u2.real_name as creator_name
FROM users u1
INNER JOIN issues i ON u1.userid = i.assignedid
INNER JOIN users u2 ON u2.userid = i.creatorid
ORDER BY i.issueid
SELECT
i.issueid,
i.assignedid,
a.real_name,
i.creatorid,
c.real_name
FROM
issues i
INNER JOIN users c
ON c.userid = i.creatorid
INNER JOIN users a
ON a.userid = i.assignedid
ORDER BY
i.issueid ASC