I know this is not possible directly, but I want to achieve it by some indirect method if possible.
Actually, I wanted to put the query below into a view, but it throws the error that subqueries are not allowed in a view.
select T1.Code,
       T1.Month,
       T1.Value,
       IFNULL(T2.Value, 0) + IFNULL(T3.Value, 0) as value_begin
from (select *, @rownum := @rownum + 1 as rownum
      from Table1
      join (select @rownum := 0) r) T1
left join (select *, @rownum1 := @rownum1 + 1 as rownum
           from Table1
           join (select @rownum1 := 0) r) T2
       on T1.Code = T2.Code
      and T1.rownum = T2.rownum + 1
left join (select *, @rownum2 := @rownum2 + 1 as rownum
           from Table1
           join (select @rownum2 := 0) r) T3
       on T1.Code = T3.Code
      and T1.rownum = T3.rownum + 2
order by T1.Code, T1.rownum
So I thought I would make the subquery a separate view, but that again throws an error: variables are not allowed in a view. Please help me overcome this situation.
Thanks in advance.
You could try the triangle join + COUNT method of assigning row numbers. It will likely not perform well on large datasets, but it does let you implement everything with a couple of views (if you think there is no other way to do what you want than with a view). The idea is as follows:
The dataset is joined to itself on the condition of master.key >= secondary.key, where master is the instance where detail data will actually be pulled from, and secondary is the other instance of the same table used to provide the row numbers.
Based on that condition, the first* master row would be joined with one secondary row, the second one with two, the third one with three and so on.
At this point, you can group the result set by the master key column(s) as well as the columns that you need in the output (although in MySQL it would be enough to group by the master key only). Counting the rows in every group then gives you the corresponding row numbers.
So, if there was a table like this:
CREATE TABLE SomeTable (
ID int,
Value int
);
the query to assign row numbers to the table could look like this:
SELECT m.ID, m.Value, COUNT(*) AS rownum
FROM SomeTable AS m
INNER JOIN SomeTable AS s ON m.ID >= s.ID
GROUP BY m.ID, m.Value
;
Since you appear to want to self-join the ranked rowset (and twice too), that would require using the above query as a derived table, and since you also want the entire thing to be a view (which doesn't allow subqueries in the FROM clause), you would probably need to define the ranking query as a separate view:
CREATE VIEW RankingView AS
SELECT m.ID, m.Value, COUNT(*) AS rownum
FROM SomeTable AS m
INNER JOIN SomeTable AS s ON m.ID >= s.ID
GROUP BY m.ID, m.Value
;
and subsequently refer to that view in the main query:
CREATE VIEW SomeOtherView AS
SELECT ...
FROM RankingView AS t1
LEFT JOIN RankingView AS t2 ON ...
...
This SQL Fiddle demo shows the method and its usage.
One note with regard to your particular situation. Your table probably needs row numbers to be assigned in partitions, i.e. every distinct Code row group needs its own row number set. That means that your ranking view should specify the joining condition as something like this:
ON m.Code = s.Code AND m.Month >= s.Month
Please note that months in this case are assumed to be unique per Code. If that is not the case, you may first need to create a view that groups the original dataset by Code, Month and rank that view instead of the original dataset.
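Putting it together for your case, a rough sketch (assuming your table is Table1 with columns Code, Month and Value, Month being unique per Code, and with ValueBeginView as a made-up name for the final view) might look like this:
CREATE VIEW RankingView AS
SELECT m.Code, m.Month, m.Value, COUNT(*) AS rownum
FROM Table1 AS m
INNER JOIN Table1 AS s
        ON m.Code = s.Code       -- restart the numbering for every Code
       AND m.Month >= s.Month    -- count rows up to and including the current month
GROUP BY m.Code, m.Month, m.Value;

CREATE VIEW ValueBeginView AS
SELECT T1.Code,
       T1.Month,
       T1.Value,
       IFNULL(T2.Value, 0) + IFNULL(T3.Value, 0) AS value_begin
FROM RankingView AS T1
LEFT JOIN RankingView AS T2
       ON T1.Code = T2.Code AND T1.rownum = T2.rownum + 1
LEFT JOIN RankingView AS T3
       ON T1.Code = T3.Code AND T1.rownum = T3.rownum + 2;
Since both views reference only base tables and other views (no derived tables, no variables), MySQL should accept them.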
* According to the order of key.
Related
I have a temporary table query, e.g.:
CREATE TEMPORARY TABLE IF NOT EXISTS table4 AS (select * from table1)
and then I have another table resulting from a query, like:
select column from table2
What I would like to do is concatenate this column as a new column onto the temporary table. An inner join would not work because they don't have a common column.
This would be like concatenate() in Python with axis=0.
I would appreciate any help.
If I understand correctly, you want to add the concatenated result of the second query as another column of your temporary table. That doesn't make much sense without more context, as you would get the same value in the new column on every row, but here goes my solution:
CREATE TEMPORARY TABLE IF NOT EXISTS table4 AS
(
select
*,
(select group_concat(column) from table2 group by null) as concatcolumn
from
table1
)
I have grouped by NULL in the GROUP_CONCAT subquery so that it aggregates over all the rows. Inside this nested subquery (a scalar subquery in the column list) you can add WHERE conditions, which would make this approach somewhat more useful. Hope this solution helps. Cheers,
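For instance, a filtered variant of that scalar subquery could look roughly like this (some_column and some_flag are hypothetical names, just to show where a WHERE clause would go):
CREATE TEMPORARY TABLE IF NOT EXISTS table4 AS
(
  select
    t1.*,
    (select group_concat(t2.some_column)   -- hypothetical column name
     from table2 t2
     where t2.some_flag = 1                -- hypothetical filter condition
    ) as concatcolumn
  from table1 t1
);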
EDIT:
Based on the OP's comments, and supposing that both tables have rows that are aligned (matching rows have the same row number but no matching key): this was more difficult than I expected, as the question is tagged MySQL and that DBMS has no ranking function. Here is what I came up with; it is untested.
CREATE TEMPORARY TABLE IF NOT EXISTS table4 AS
(
select
t1.*,
t2.column
from
(
select t.*, @rownum := @rownum + 1 as rank from table1 t, (select @rownum := 0) r
) t1
join
(
select t.*, @rownum := @rownum + 1 as rank from table2 t, (select @rownum := 0) r
) t2
on
t1.rank = t2.rank
)
I have an app that reads from multiple MySQL tables, but I'd like to put all the data into one table. Thing is, these tables have no linking fields... the app just sequentially processes the rows across the three tables, in the hope that the correct rows line up in each table (i.e. that row 1 in table1 is applicable to row 1 in table2 and table3, and so on).
My tables are as follows:
Table1:
Name,Surname,ID,DoB
Table2:
Address,Town,State
Table3:
password
What I want is :
Table4:
Name,Surname,ID,DoB,Address,Town,State,password
I have created Table4 and I'm now trying to insert the values with a select query...
I've tried ...
SELECT
t1.Name,
t1.Surname,
t1.ID,
t1.DoB,
t2.Address,
t2.Town,
t2.State,
t3.password
FROM table1 AS t1,table2 AS t2, table3 AS t3;
...but this gives me duplicate rows because there is no WHERE clause. And since there are no linking fields, I can't use a JOIN statement, right?
I'm not very experienced with SQL, so please help!
Well, officially you're out of luck. There is no first or last row in an RDBMS unless you use an ORDER BY clause. That's also what the manual states. If you issue a
SELECT * FROM your_table;
you cannot be sure to get the result in the same order the rows were inserted, or even in the same order every time you issue the statement.
In practice, on the other hand, most of the time you will get the same result, and most of the time even in the same order the rows were inserted.
What you can do is, first, slap whoever didn't think of putting a column in each table that determines a sort order (in the future, use either an auto_increment column or a timestamp column that holds the date and time of insertion, or whatever suits your needs), and second (but really only do this if you have no other choice, because as I said it's unreliable), emulate a row number on which you can join.
SELECT * FROM (
    SELECT table1.*, @rn1 := @rn1 + 1 as row_number FROM
    table1,
    (SELECT @rn1 := 0) v
) a
LEFT JOIN (
    SELECT table2.*, @rn2 := @rn2 + 1 as row_number FROM
    table2,
    (SELECT @rn2 := 0) v
) b ON a.row_number = b.row_number
LEFT JOIN (
    SELECT table3.*, @rn3 := @rn3 + 1 as row_number FROM
    table3,
    (SELECT @rn3 := 0) v
) c ON a.row_number = c.row_number
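Applied to your tables, loading Table4 could then be sketched like this (assuming Table4 already exists with the columns you listed, and keeping in mind the ordering caveat above):
INSERT INTO table4 (Name, Surname, ID, DoB, Address, Town, State, password)
SELECT a.Name, a.Surname, a.ID, a.DoB,
       b.Address, b.Town, b.State,
       c.password
FROM (
    SELECT table1.*, @rn1 := @rn1 + 1 as row_number FROM
    table1,
    (SELECT @rn1 := 0) v
) a
LEFT JOIN (
    SELECT table2.*, @rn2 := @rn2 + 1 as row_number FROM
    table2,
    (SELECT @rn2 := 0) v
) b ON a.row_number = b.row_number
LEFT JOIN (
    SELECT table3.*, @rn3 := @rn3 + 1 as row_number FROM
    table3,
    (SELECT @rn3 := 0) v
) c ON a.row_number = c.row_number;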
I am running the following query to get the rank of a business in all categories in terms of the total number of likes.
SET @rownum = 0;
SELECT b.*
, (
    SELECT f4.rank from business as b2 INNER JOIN (
        select count(*) count, @rownum := @rownum + 1 as rank, f3.*
        from favourites as f3
        GROUP BY f3.business_id
        ORDER BY count DESC
    ) as f4 ON b2.id = f4.business_id
    WHERE b2.id = 8 && f4.category_id = c.id
)
as rank FROM business as b, category c where b.id=8
rank gives NULL after the first row; what should I do to reset @rownum to 0 for the next row?
To reset the @rownum user variable, you could try including an inline view (i.e. a derived table) in the FROM clause.
It looks like you would need that within the inner correlated subquery. That correlated subquery should get re-executed for every row from category c, or at least every distinct value of c.id. (I'm going to assume that the id column in each table is the primary key.)
e.g.
FROM ...
JOIN (SELECT @rownum := 0) r
WHERE ...
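For illustration, that pattern in a self-contained form (some_table is a placeholder name):
SELECT t.*, @rownum := @rownum + 1 AS rn
FROM some_table t
JOIN (SELECT @rownum := 0) r;
The derived table r re-initializes @rownum to 0 every time the statement (or the subquery that contains it) is evaluated, so the numbering starts from 1 again on each execution.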
BUT... I am hesitant to recommend this approach to you, because I am having difficulty unwinding your SQL statement. It's not clear what resultset you want returned. It looks like that query should be throwing an exception, if that subquery returns more than one row. I just don't see anything explicit or implied that would give you that guarantee.
An example of the desired output would go a long ways to getting some useful help.
I am pretty sure you want ROW_NUMBER, RANK or DENSE_RANK partitioned by business_id, but I cannot penetrate your SQL.
Some inputs & outputs would be helpful.
select *
from business as f4
inner join
(
    select business_id, count(*) as likes,
           rank() over (partition by business_id order by count(*) desc) as rank
    from favourites
    group by business_id
) as counts
    on f4.id = counts.business_id
might be close
It seems to me that your code should increment @rownum for every row in the result because the first subquery and therefore the joined subquery should be executed once for every row.
In my opinion, your query is equivalent to the following:
SELECT b.*, @rownum := @rownum + 1 AS rank
FROM business AS b, category c
WHERE b.id=8
Edit: If the problem is that you need to reset @rownum in a subquery but you're limited to a single column in the result, use something like this construct:
SELECT IF(@rownum := 0, NULL, f4.rank) AS rank FROM ...
The condition @rownum := 0 is always evaluated, resetting @rownum, and because it evaluates to 0 (i.e. false), the value of f4.rank is always returned.
What would be the best way to return one item for each id instead of all of the other items within the table? Currently the query below returns all manufacturers:
SELECT m.name
FROM `default_ps_products` p
INNER JOIN `default_ps_products_manufacturers` m ON p.manufacturer_id = m.id
I have solved my question by using the DISTINCT keyword in my query:
SELECT DISTINCT m.name, m.id
FROM `default_ps_products` p
INNER JOIN `default_ps_products_manufacturers` m ON p.manufacturer_id = m.id
ORDER BY m.name
There are four main ways I can think of to delete duplicate rows.
Method 1
Delete all rows with a rowid greater than the smallest (or, alternatively, less than the greatest) rowid value for the same key. Example:
delete from tableName a where rowid > (select min(rowid) from tableName b where a.key1=b.key1 and a.key2=b.key2)
Method 2
Usually faster, but you must recreate all indexes, constraints and triggers afterwards: pull all rows as DISTINCT into a new table, then drop the first table and rename the new table to the old table name.
Example (t2 is the original table, t1 the de-duplicated copy):
create table t1 as select distinct * from t2; drop table t2; rename t1 to t2;
Method 3
Delete using WHERE EXISTS based on rowid. Example:
delete from tableName a where exists(select 'x' from tableName b where a.key1=b.key1 and a.key2=b.key2 and b.rowid > a.rowid)
Note: if NULLs can appear in a key column, use NVL on the column name.
Method 4
Collect the first row for each key value and delete the rows not in this set. Example:
delete from tableName a where rowid not in (select min(rowid) from tableName b group by key1, key2)
Note that you don't have to use NVL for method 4.
Using DISTINCT is often a bad practice. It may be a sign that there is something wrong with your SELECT statement, or that your data structure is not normalized.
In your case I would use this (on the assumption that default_ps_products_manufacturers has unique records):
SELECT m.id, m.name
FROM default_ps_products_manufacturers m
WHERE EXISTS (SELECT 1 FROM default_ps_products p WHERE p.manufacturer_id = m.id)
Or an equivalent query with IN:
SELECT m.id, m.name
FROM default_ps_products_manufacturers m
WHERE m.id IN (SELECT p.manufacturer_id FROM default_ps_products p)
The only thing is that, among all possible queries, it is better to select the one with the better execution plan, which may depend on your vendor and/or the physical structure, statistics, etc. of your database.
I think in most cases EXISTS will work better.
I've got a weird one, and I don't know if it's my syntax (which seems straightforward) or a bug (or just unsupported).
Here's my query that works but is needlessly slow:
UPDATE table1
SET table1column1 =
(SELECT COUNT(DISTINCT table2column1) FROM table2view WHERE table2column1 <= (SELECT table2column1 FROM table2 WHERE table2.id = table1.id) )
/
(SELECT COUNT(DISTINCT table2column1) FROM table2)
+ (SELECT COUNT(DISTINCT table2column2) FROM table2view WHERE table2column2 <= (SELECT table2column2 FROM table2 WHERE table2.id = table1.id) )
/
(SELECT COUNT(DISTINCT table2column2) FROM table2)
+ (SELECT COUNT(DISTINCT table2column3) FROM table2view WHERE table2column3 <= (SELECT table2column3 FROM table2 WHERE table2.id = table1.id) )
/ (SELECT COUNT(DISTINCT table2column3) FROM table2);
It's just the sum of three percentiles (of table2column1, table2column2, and table2column3) with duplicates removed.
Here's where it gets weird. I have to use a view for this to work on the subquery with the WHERE or it will only UPDATE the first row of table1, and set the rest of the rows' table1column1 to 0. That table2view is an exact duplicate of table2. Yeah, weird.
If I don't use DISTINCT, I can do it without the view. Does that make sense? Note: I have to have DISTINCT because I have lots of duplicates.
I tried making it SELECT only from the view, but that slowed it down worse.
Does anyone know what the problem is and the best way to rework this query so it doesn't take so long? It's in a TRIGGER, and the updated data is needed pretty much on demand.
Many thanks in advance!
Details
I'm testing the speed in phpMyAdmin's command line.
I'm pretty sure the degradation is coming from the view since the more of the view and the less of the actual table I use, the slower it gets.
When I do the one without DISTINCT, it's lightning fast.
Only works on views?
OK, so I just set up a copy of table2. I tried first to do the original query substituting the view with the copy. No go.
I tried to do the query below with the copy instead of the view. No go.
Hopefully the introduction of these constants will better show what I'm trying to do.
SET @table2column1_distinct_count = (SELECT COUNT(DISTINCT table2column1) FROM table2);
SET @table2column2_distinct_count = (SELECT COUNT(DISTINCT table2column2) FROM table2);
SET @table2column3_distinct_count = (SELECT COUNT(DISTINCT table2column3) FROM table2);
UPDATE table1, table2
SET table1.table1column1 = (SELECT COUNT(DISTINCT table2column1) FROM table2view WHERE table2column1 <= table2.table2column1) / @table2column1_distinct_count
+ (SELECT COUNT(DISTINCT table2column2) FROM table2view WHERE table2column2 <= table2.table2column2) / @table2column2_distinct_count
+ (SELECT COUNT(DISTINCT table2column3) FROM table2view WHERE table2column3 <= table2.table2column3) / @table2column3_distinct_count
WHERE table1.id = table2.id;
Again, when I use table2 instead of the table2view, it only updates the first row properly and sets all other rows' table1.table1column1 = 0.
Math
I'm trying to set table1.table1column1 = to the sum of the percentiles of table2column1, table2column2, and table2column3 by id.
I compute a percentile as (the count of distinct values of table2columnX <= the current table2columnX) / (the total count of distinct table2columnX values).
I use DISTINCT to get rid of the excessive duplicates.
View
Here's the SELECT for the view. Does this help?
CREATE VIEW myTable.table2view AS SELECT
table2.table2column1 AS table2column1,
table2.table2column2 AS table2column2,
table2.table2column2 AS table2column3
FROM table2
GROUP BY table2.id;
Is there something special about the GROUP BY in the view's SELECT that makes this work (that I'm not seeing)?
I would probably say that the query is slow because it is repeatedly accessing the table when the trigger fires.
I am no SQL expert but I have tried to put together a query using temporary tables. You can see if it helps speed up the query. I have used different but similar sounding column names in my code sample below.
EDIT: There was a calculation error in my earlier code. Updated now.
SELECT COUNT(id) INTO @no_of_attempts from tb2;
-- DROP TABLE IF EXISTS S1Percentiles;
-- DROP TABLE IF EXISTS S2Percentiles;
-- DROP TABLE IF EXISTS S3Percentiles;
CREATE TEMPORARY TABLE S1Percentiles (
s1 FLOAT NOT NULL,
percentile FLOAT NOT NULL DEFAULT 0.00
);
CREATE TEMPORARY TABLE S2Percentiles (
s2 FLOAT NOT NULL,
percentile FLOAT NOT NULL DEFAULT 0.00
);
CREATE TEMPORARY TABLE S3Percentiles (
s3 FLOAT NOT NULL,
percentile FLOAT NOT NULL DEFAULT 0.00
);
INSERT INTO S1Percentiles (s1, percentile)
SELECT A.s1, ((COUNT(B.s1)/@no_of_attempts)*100)
FROM (SELECT DISTINCT s1 from tb2) A
INNER JOIN tb2 B
ON B.s1 <= A.s1
GROUP BY A.s1;
INSERT INTO S2Percentiles (s2, percentile)
SELECT A.s2, ((COUNT(B.s2)/@no_of_attempts)*100)
FROM (SELECT DISTINCT s2 from tb2) A
INNER JOIN tb2 B
ON B.s2 <= A.s2
GROUP BY A.s2;
INSERT INTO S3Percentiles (s3, percentile)
SELECT A.s3, ((COUNT(B.s3)/@no_of_attempts)*100)
FROM (SELECT DISTINCT s3 from tb2) A
INNER JOIN tb2 B
ON B.s3 <= A.s3
GROUP BY A.s3;
-- select * from S1Percentiles;
-- select * from S2Percentiles;
-- select * from S3Percentiles;
UPDATE tb1 A
INNER JOIN
(
SELECT B.tb1_id AS id, (C.percentile + D.percentile + E.percentile) AS sum FROM tb2 B
INNER JOIN S1Percentiles C
ON B.s1 = C.s1
INNER JOIN S2Percentiles D
ON B.s2 = D.s2
INNER JOIN S3Percentiles E
ON B.s3 = E.s3
) F
ON A.id = F.id
SET A.sum = F.sum;
-- SELECT * FROM tb1;
DROP TABLE S1Percentiles;
DROP TABLE S2Percentiles;
DROP TABLE S3Percentiles;
What this does is record the percentile for each score group and then finally just update the tb1 column with the requisite data, instead of recalculating the percentile for each row.
You should also index columns s1, s2 and s3 for optimizing the queries on these columns.
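For example (the index names are arbitrary; tb2, s1, s2 and s3 are the placeholder names used above):
CREATE INDEX idx_tb2_s1 ON tb2 (s1);
CREATE INDEX idx_tb2_s2 ON tb2 (s2);
CREATE INDEX idx_tb2_s3 ON tb2 (s3);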
Note: Please update the column names according to your db schema. Also note that each percentile calculation has been multiplied by 100 as I believe that percentile is usually calculated that way.