Updating a column using the ROW_NUMBER() function fails. Could anyone suggest a solution? - sql-server-2008

I know this might be a silly question, but I have not found a solution so far, so I am asking it with all the inputs and outputs I have. Could anyone provide a solution?
What I want to do is this: a ParcelNo can have one or more invoice numbers. I want to find how many invoice numbers a parcel has and give each one a rank. The ranking part is important because my further work depends on this column.
I have one table named TableA. It has three columns: Invoicenumber, which is the unique id; ParcelNo, which can be duplicated; and Ranking, which I want to update.
CREATE TABLE TableA
(
Invoicenumber varchar(5),
ParcelNo varchar(5),
Ranking bit,
IDate Datetime
)
INSERT INTO TableA (Invoicenumber, ParcelNo)
VALUES ('INV01', 'P0001'), ('INV02', 'P0001'),
('INV03', 'P0002'), ('INV04', 'P0002'),
('INV05', 'P0003'), ('INV06', 'P0003')
When I run the following query the output is as desired.
;WITH CTE AS
(
SELECT
*,
ROW_NUMBER() OVER(PARTITION BY PARCELNO ORDER BY INVOICENUMBER) AS RWNO
FROM
TableA
)
SELECT
T.*, C.RWNO
FROM CTE C
JOIN TableA T ON T.Invoicenumber = C.Invoicenumber
The output is below:
So, I tried to update the Ranking column in Table A.
I run this query to do so:
;WITH CTE AS
(
SELECT
*,
ROW_NUMBER() OVER(PARTITION BY PARCELNO ORDER BY INVOICENUMBER) AS RWNO
FROM
TableA
)
UPDATE T
SET Ranking = C.RWNO
FROM CTE C
JOIN TableA T ON T.Invoicenumber = C.Invoicenumber
But the output is wrong. The column is not updated as expected.
Below is the output of the updated column:
Why is the Ranking column updated incorrectly?
I want to update the column to prepare some data. This table is a sample for the explanation.
I am elaborating on my issue below:
The image below shows two tables:
Table A and Table B both have an IDate column.
I want to update the IDate column in A from B, but the dates should be unique: the first date should not be repeated. These dates are associated with invoice numbers.

I think what you really want is a computed column (also called a calculated or generated column). I'm guessing that your parcel number should point to another table that stores information about the parcels. If that's the case, then go with:
-- First approach
CREATE TABLE Parcels (
id int IDENTITY (1,1) NOT NULL,
ParcelNo varchar(5),
Description varchar(max)
-- Ranking AS (SELECT COUNT(*) FROM Invoices i WHERE i.ParcelID = id)
);
CREATE TABLE Invoices (
id int IDENTITY (1,1) NOT NULL,
InvoiceNumber varchar(5),
ParcelID int FOREIGN KEY REFERENCES Parcels(id)
);
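-- SQL Server rejects a subquery written directly in a computed column, so wrap the count in a scalar function.
-- (dbo.CountInvoices is a name chosen for this example; CREATE FUNCTION must run in its own batch, hence GO.)
GO
CREATE FUNCTION dbo.CountInvoices (@ParcelID int) RETURNS int
AS BEGIN RETURN (SELECT COUNT(*) FROM Invoices i WHERE i.ParcelID = @ParcelID) END;
GO
ALTER TABLE Parcels ADD Ranking AS dbo.CountInvoices(id);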
INSERT INTO Parcels
(ParcelNo)
VALUES
('P0001'),
('P0002'),
('P0003');
INSERT INTO Invoices
(InvoiceNumber, ParcelID)
VALUES
('INV01', (SELECT p.id FROM Parcels p WHERE p.ParcelNo = 'P0001')),
('INV02', (SELECT p.id FROM Parcels p WHERE p.ParcelNo = 'P0001')),
('INV03', (SELECT p.id FROM Parcels p WHERE p.ParcelNo = 'P0002')),
('INV04', (SELECT p.id FROM Parcels p WHERE p.ParcelNo = 'P0002')),
('INV05', (SELECT p.id FROM Parcels p WHERE p.ParcelNo = 'P0003')),
('INV06', (SELECT p.id FROM Parcels p WHERE p.ParcelNo = 'P0003'));
On the other hand, if you really want all the data in a single table, then try this:
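-- Second approach
CREATE TABLE TableA (
Invoicenumber varchar(5),
ParcelNo varchar(5)
);
GO
-- A computed column cannot contain a subquery, so the count goes through a scalar function
-- (dbo.CountParcelInvoices is a name chosen for this example).
CREATE FUNCTION dbo.CountParcelInvoices (@ParcelNo varchar(5)) RETURNS int
AS BEGIN RETURN (SELECT COUNT(*) FROM TableA a WHERE a.ParcelNo = @ParcelNo) END;
GO
ALTER TABLE TableA ADD Ranking AS dbo.CountParcelInvoices(ParcelNo);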
Some notes:
Both of my approaches assume that by ranking, you mean that you want a count of how many invoices are in a parcel.
My first approach has a circular reference, because the Invoices table has a foreign key into the Parcels table, but the Parcels table tabulates information from the Invoices table. That's why I commented out the calculated field in the first table, then added the calculated field back in after creating both tables.
Notice that I capitalized all SQL keywords (except the types such as varchar). It's easier to read SQL if you either go with all caps or no caps for an entire query.
Notice my semicolons at the end of each logical break. Semi-colons are technically optional, but a lot of folks consider using them to be good practice.
For my first approach, I'm using a foreign key. You can read more about those here.
Because my first approach split the table into 2 tables, I needed to somehow know the id of the Parcels table when populating the Invoices table, even though the ids are given by the database (so I can't know them ahead of time). Those select statements accomplish that.
My syntax should work with SQL Server, but not necessarily with any other DBMS. That's because the syntax for calculated fields varies between database systems.
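A likely cause of the wrong values in the question, assuming SQL Server semantics: Ranking is declared as bit, which holds only 0, 1 or NULL, and any nonzero value assigned to it is stored as 1. With an integer column the asker's CTE should behave as intended. Below is a minimal sketch of that idea, run through Python's sqlite3 module so it is self-contained. It assumes SQLite 3.25 or later for ROW_NUMBER; the correlated-subquery form of the UPDATE is a substitution for the T-SQL UPDATE ... FROM, and rank_invoices is a name made up for the example.

```python
import sqlite3


def rank_invoices(conn):
    """Number each invoice within its parcel and store the number in Ranking."""
    conn.execute(
        """
        WITH cte AS (
            SELECT Invoicenumber,
                   ROW_NUMBER() OVER (
                       PARTITION BY ParcelNo ORDER BY Invoicenumber
                   ) AS rwno
            FROM TableA
        )
        UPDATE TableA
        SET Ranking = (
            SELECT rwno FROM cte
            WHERE cte.Invoicenumber = TableA.Invoicenumber
        )
        """
    )
    return conn.execute(
        "SELECT Invoicenumber, ParcelNo, Ranking FROM TableA ORDER BY Invoicenumber"
    ).fetchall()


conn = sqlite3.connect(":memory:")
# Ranking is an INTEGER here (assumption: the bit type was not intended).
conn.execute(
    "CREATE TABLE TableA (Invoicenumber TEXT, ParcelNo TEXT, Ranking INTEGER)"
)
conn.executemany(
    "INSERT INTO TableA (Invoicenumber, ParcelNo) VALUES (?, ?)",
    [
        ("INV01", "P0001"), ("INV02", "P0001"),
        ("INV03", "P0002"), ("INV04", "P0002"),
        ("INV05", "P0003"), ("INV06", "P0003"),
    ],
)
ranked = rank_invoices(conn)
for row in ranked:
    print(row)
```

In SQL Server itself, the equivalent change would be to declare Ranking as int and keep the original UPDATE.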

Stored procedure for finding score up until given date

Edit: owngoals is the total number of goals scored by the team. othergoals is the total number of goals scored against it by other teams.
I have 2 tables, a TEAMS and MATCHES table.
create table teams
(
Id char(3) primary key,
name varchar(40),
nomatches int,
owngoals int,
othergoals int,
points int
)
and
create table matches
(
id int identity(1,1),
homeid char(3) foreign key references teams(id),
outid char(3) foreign key references teams(id),
homegoal int,
outgoal int,
matchdate date
)
I have triggers for inserting and deleting that keep the correct score for the teams.
The select to show the scoreboard looks like this:
select name, nomatches, owngoals, othergoals, points from teams
order by points desc
It gives this result, where the numbers are the number of matches, goals scored by the team, goals scored against it, and total points.
Now I need to make a stored procedure that builds the scoreboard, but only up to a given date. I have tried a few things, such as copying the Teams table into #tmpTeams, but nothing has worked.
Do you just want an aggregation query? Your question doesn't explain what the logic is for the columns, but I'm guessing something like this:
select t.id, t.name,
sum(mh.homegoal) as homegoals,
sum(mo.outgoal) as othergoals,
coalesce(sum(mh.homegoal), 0) + coalesce(sum(mo.outgoal), 0) as totalpoints
from teams t left join
matches mh
on mh.homeid = t.id and
mh.matchdate <= ? left join
matches mo
on mo.outid = t.id and
mo.matchdate <= ?
group by t.id, t.name;
The ? is intended to be a parameter for your date.
Note: I see no reason to have triggers when this information can be calculated using a query, unless you have a performance issue that requires triggers to solve.
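One thing to watch with a query that joins matches twice: each home row can pair up with each away row of the same team, which may inflate the sums. A hedged sketch of an alternative is to stack home and away rows with UNION ALL first and aggregate once. It is shown here through Python's sqlite3 so it runs as-is; the sample teams and matches are invented, dates are stored as ISO strings, and points are left out because the question does not state the scoring rule.

```python
import sqlite3

SCOREBOARD_SQL = """
SELECT t.name,
       COUNT(m.team_id)             AS nomatches,
       COALESCE(SUM(m.scored), 0)   AS owngoals,
       COALESCE(SUM(m.conceded), 0) AS othergoals
FROM teams t
LEFT JOIN (
    -- one row per team per match: first the home side, then the away side
    SELECT homeid AS team_id, homegoal AS scored, outgoal AS conceded, matchdate
    FROM matches
    UNION ALL
    SELECT outid, outgoal, homegoal, matchdate
    FROM matches
) m ON m.team_id = t.id AND m.matchdate <= ?
GROUP BY t.id, t.name
ORDER BY t.name
"""

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE teams (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE matches (
        id INTEGER PRIMARY KEY,
        homeid TEXT REFERENCES teams(id),
        outid TEXT REFERENCES teams(id),
        homegoal INTEGER,
        outgoal INTEGER,
        matchdate TEXT
    );
    INSERT INTO teams VALUES ('AAA', 'Alpha'), ('BBB', 'Beta');
    INSERT INTO matches (homeid, outid, homegoal, outgoal, matchdate) VALUES
        ('AAA', 'BBB', 2, 1, '2020-01-01'),
        ('BBB', 'AAA', 0, 3, '2020-02-01'),
        ('AAA', 'BBB', 1, 1, '2020-03-01');
    """
)

# Scoreboard counting only matches played on or before the cutoff date.
scoreboard = conn.execute(SCOREBOARD_SQL, ("2020-02-15",)).fetchall()
for row in scoreboard:
    print(row)
```

In a stored procedure, the ? placeholder would become the date parameter.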

sql table design to fetch records with multiple inclusion and exclusion conditions

We want to select customers based on the following parameters, i.e. the customer should match:
specific city i.e. cityId=1,2,3...
specific customerId should be excluded i.e. customerId=33,2323,34534...
specific age i.e. 5 years, 7 years, 72 years...
These inclusion and exclusion lists can be arbitrarily long.
How should we design database for this:
Create separate table 'customerInclusionCities' for these inclusion cities and do like:
select * from customers where cityId in (select cityId from customerInclusionCities)
We do the same for age: create a table 'customerEligibleAge' with all eligible ages:
i.e. select * from customers where age in (select age from customerEligibleAge)
and Create separate table 'customerIdToBeExcluded' for excluding customers:
i.e. select * from customers where customerId not in (select customerId from customerIdToBeExcluded)
OR
Create One table with Category and Ids.
i.e. Category1 for cities, Category2 for CustomerIds to be excluded.
Which approach is better, creating one table for these parameters OR creating separate tables for each list i.e. age, customerId, city?
IN ( SELECT ... ) can be very slow. Do your query as a single SELECT without subqueries. I assume all 3 columns are in the same table? (If not, that adds complexity.) The WHERE clause will probably have 3 IN ( constants ) clauses:
SELECT ...
FROM tbl
WHERE cityId IN (1,2,3...)
AND customerId NOT IN (33,2323,34534...)
AND age IN (5, 7, 72)
Have (at least):
INDEX(cityId),
INDEX(age)
(Negated things are unlikely to be able to use an index.)
The query will use one of the indexes; having both will give the Optimizer a choice of which it thinks is better.
Or...
SELECT c.*
FROM customers AS c
JOIN customerInclusionCities AS b ON b.cityId = c.cityId
JOIN customerEligibleAge AS ce ON c.age = ce.age
LEFT JOIN customerIdToBeExcluded AS ex ON c.customerId = ex.customerId
WHERE ex.customerId IS NULL
Suggested indexes (probably as PRIMARY KEY):
customerInclusionCities: (cityId)
customerEligibleAge: (age)
customerIdToBeExcluded: (customerId)
In order to discuss further, please provide SHOW CREATE TABLE for each table and EXPLAIN SELECT ... for any of the queries that actually work.
If you use the database only for that operation, I recommend the first solution. The first solution is also very simple to deploy.
The second solution fills the DB up with junk.
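For what it's worth, here is a small sketch of the separate-tables design, using the table names from the question. It runs through Python's sqlite3 purely so it is self-contained; the sample rows are invented, and NOT EXISTS is used for the exclusion list as one common alternative to the LEFT JOIN ... IS NULL form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customerId INTEGER PRIMARY KEY, cityId INTEGER, age INTEGER);
    CREATE TABLE customerInclusionCities (cityId INTEGER PRIMARY KEY);
    CREATE TABLE customerEligibleAge (age INTEGER PRIMARY KEY);
    CREATE TABLE customerIdToBeExcluded (customerId INTEGER PRIMARY KEY);

    INSERT INTO customers VALUES
        (1, 1, 5),    -- eligible
        (2, 1, 6),    -- age not in the list
        (33, 2, 7),   -- explicitly excluded
        (4, 9, 5),    -- city not in the list
        (5, 3, 72);   -- eligible
    INSERT INTO customerInclusionCities VALUES (1), (2), (3);
    INSERT INTO customerEligibleAge VALUES (5), (7), (72);
    INSERT INTO customerIdToBeExcluded VALUES (33);
    """
)

ELIGIBLE_SQL = """
SELECT c.customerId
FROM customers AS c
JOIN customerInclusionCities AS ci ON ci.cityId = c.cityId
JOIN customerEligibleAge AS ea ON ea.age = c.age
WHERE NOT EXISTS (
    SELECT 1 FROM customerIdToBeExcluded AS ex
    WHERE ex.customerId = c.customerId
)
ORDER BY c.customerId
"""

eligible = [row[0] for row in conn.execute(ELIGIBLE_SQL)]
print(eligible)
```

Because each list table has a single-column primary key, every lookup in the joins is a key lookup, which is the main argument for keeping the lists in separate tables.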

Mysql Obtaining actual value through FK when Selecting all rows

I need to select * FROM sections and get the column values for every row to fill a JTable. My problem is that the adviserId column on the sections table is an INT.
Because I'm getting the result set of every column on every row, I cannot use a WHERE clause. I thought of a subquery, but since the id is different on every row, no predetermined id can be supplied in the WHERE clause.
So if I run my stored procedure, I get just an int value for adviserId instead of the teacher's name.
I have teachers and sections table.
Teacher
id PK INT
lastName
firstName
middleName
isAdviser
status
Sections
id PK
name
adviserId FK-- REFERENCING `id` column ON teacher table
What would be the best approach? I hope you can help.
Thanks.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
I've created the final stored procedure based on everyone's suggestion. (THANKS AGAIN all.)
CREATE DEFINER=`root`@`localhost` PROCEDURE `getAllSectionsInfo`()
BEGIN
SELECT
s.`name` AS `Section Name`,
s.`session` AS `Session`,
CONCAT(t.lastName,',',t.firstName,' ',t.middleName) AS Adviser,
s.yearLevel AS `Year Level`,
CONCAT(syStart,'-',syEnd) AS SchoolYear
FROM sections s
INNER JOIN
teacher t on s.adviserId = t.id;
END
Yes I also think the same, that a simple inner join will do your job. Try the below example..
create table JTable as select T.id as Tid,T.lastName,T.firstName,T.middleName,T.isAdviser,T.status,S.id as Sid,S.name,S.adviserId
from Sections as S
inner join Teachers as T on T.id = S.adviserId
You can use a left join here to make sure that you get all records of the Sections table, either with the related Teachers data or with nulls.
So now the JTable will have all the columns that you put in the select list.
Below is a solution for the data selection:
SELECT * FROM sections s INNER JOIN teacher t ON s.adviserId = t.id
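To illustrate the difference the left join makes, here is a minimal sketch run through Python's sqlite3 (chosen only so the example is self-contained; the question itself uses MySQL). The sample rows are invented, and SQLite's || operator stands in for MySQL's CONCAT. A section without an adviser is kept, with NULL in place of the name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE teacher (
        id INTEGER PRIMARY KEY,
        lastName TEXT,
        firstName TEXT
    );
    CREATE TABLE sections (
        id INTEGER PRIMARY KEY,
        name TEXT,
        adviserId INTEGER REFERENCES teacher(id)
    );
    INSERT INTO teacher VALUES (10, 'Cruz', 'Ana');
    INSERT INTO sections VALUES (1, 'Section A', 10), (2, 'Section B', NULL);
    """
)

SECTIONS_SQL = """
SELECT s.name,
       t.lastName || ', ' || t.firstName AS adviser
FROM sections AS s
LEFT JOIN teacher AS t ON t.id = s.adviserId
ORDER BY s.id
"""

sections = conn.execute(SECTIONS_SQL).fetchall()
for name, adviser in sections:
    # adviser is None when the section has no matching teacher row
    print(name, adviser)
```

With INNER JOIN instead of LEFT JOIN, 'Section B' would drop out of the result entirely.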

Sql: choose all baskets containing a set of particular items

Eddy has baskets with items. Each item can belong to arbitrary number of baskets or can belong to none of them.
Sql schema to represent it is as following:
tbl_basket
- basketId
tbl_item
- itemId
tbl_basket_item
- pkId
- basketId
- itemId
Question: how to select all baskets containing a particular set of items?
UPDATE. Only baskets containing all of the items are needed. Otherwise it would have been an easy task to solve.
UPDATE B. I have implemented the following solution, including SQL generation in PHP:
SELECT basketId
FROM tbl_basket
JOIN (SELECT basketId FROM tbl_basket_item WHERE itemId = 1 ) AS t0 USING(basketId)
JOIN (SELECT basketId FROM tbl_basket_item WHERE itemId = 15 ) AS t1 USING(basketId)
JOIN (SELECT basketId FROM tbl_basket_item WHERE itemId = 488) AS t2 USING(basketId)
where the number of JOINs equals the number of items.
That works well unless some of the items are included in almost every basket. Then performance drops dramatically.
UPDATE B+. To resolve the performance issues, a heuristic is applied. First you select the frequency of each item. If it exceeds some threshold, you don't include it in the JOINs and either:
apply post-filtering in PHP
or just don't filter by that particular itemId, giving the user approximate results in a reasonable amount of time
UPDATE B++. It seems that the current problem has no nice solution in MySQL. This point raises one question and one solution:
(question) Does PostgreSQL have some advanced indexing techniques which allows to solve this problem without doing a full scan?
(solution) Seems that it could be solved nicely in Redis using sets and SINTER command to get an intersection.
I think the best way is to create a temporary table with the set of needed items (procedure that takes the item ids as parameters or something along those lines) and then left join it with all of the above tables joined together.
If for a given basketid you have NO nulls on the right side of the left join, the basket contains all the needed items.
-- the table definitions
CREATE TABLE basket ( basketid INTEGER NOT NULL PRIMARY KEY);
CREATE TABLE item ( itemid INTEGER NOT NULL PRIMARY KEY);
CREATE TABLE basket_item
( basketid INTEGER NOT NULL REFERENCES basket (basketid)
, itemid INTEGER NOT NULL REFERENCES item (itemid)
, PRIMARY KEY (basketid, itemid)
);
-- the query
SELECT * FROM basket b
WHERE NOT EXISTS (
SELECT * FROM item i
WHERE i.itemid IN (1,15,488)
AND NOT EXISTS (
SELECT * FROM basket_item bi
WHERE bi.basketid = b.basketid
AND bi.itemid = i.itemid
)
);
If you are going to provide the list of items, then edit id1, id2, etc. in the query below:
select distinct t.basketId
from tbl_basket_item as t
where t.itemID in (id1, id2)
This gives all baskets containing at least one of the listed items, so on its own it does not guarantee that a basket holds every item. No need to join any other tables, as your requirements don't need them.
The simplest solution is to use a HAVING clause.
SELECT basketId
FROM tbl_basket_item
WHERE itemId IN (1,15,488)
GROUP BY basketId
HAVING COUNT(DISTINCT itemId) = 3 -- DISTINCT in case we have duplicate items in a basket
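As a quick check of the GROUP BY / HAVING idea, here is a minimal sketch run through Python's sqlite3 so it is self-contained. The sample baskets are invented, and baskets_with_all is a helper name made up for the example. It builds the IN list from parameters and compares the distinct match count with the number of wanted items.

```python
import sqlite3


def baskets_with_all(conn, item_ids):
    """Return ids of baskets that contain every item in item_ids."""
    wanted = sorted(set(item_ids))
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT basketId
        FROM tbl_basket_item
        WHERE itemId IN ({placeholders})
        GROUP BY basketId
        HAVING COUNT(DISTINCT itemId) = ?
        ORDER BY basketId
    """
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_basket_item ("
    "pkId INTEGER PRIMARY KEY, basketId INTEGER, itemId INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl_basket_item (basketId, itemId) VALUES (?, ?)",
    [
        (1, 1), (1, 15), (1, 488),           # has all three
        (2, 1), (2, 15),                     # missing 488
        (3, 1), (3, 15), (3, 488), (3, 7),   # all three plus an extra item
    ],
)

matching = baskets_with_all(conn, [1, 15, 488])
print(matching)
```

Only one pass over tbl_basket_item is needed regardless of how many items are in the set, which is the main difference from the one-JOIN-per-item query in the question.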

Delete from a table matching one criteria where there are rows in same table matching different criteria?

Sorry for the mega title... I was trying to be descriptive enough. I've got a table that contains event attendance data that has some erroneous data in it. The table definition is kind of like this:
id (row id)
date
company_name
attendees
It ended up with some cases where for a given date, there are two entries matching a company_name and date but one has attendees=0 and the other has attendees>0. In those cases, I want to discard the ones where attendees=0.
I know you can't join on the same table while deleting, so please consider this query to be pseudocode that shows what I want to accomplish.
DELETE FROM attendance a WHERE a.attendees=0 AND a.date IN (SELECT b.date FROM attendance b WHERE b.attendees > 0 AND b.company_name = a.company_name);
I also tried to populate a temporary table with the ids of the rows I want to delete, but that query hangs because of the IN (SELECT ...) clause. My table has thousands of rows so that just maxes out the CPU and then times out.
This ugly thing should work (wrapping the subquery in an aliased derived table avoids the "You can't specify target table for update in FROM clause" error):
DELETE FROM attendance
WHERE (attendees, date, company_name)
IN (SELECT c.a, c.d, c.c
FROM
(SELECT MIN(attendees) a, date d, company_name c
FROM attendance
GROUP BY date, company_name
HAVING COUNT(*) > 1 AND MIN(attendees) = 0 AND MAX(attendees) > 0) AS c);
SqlFiddle
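To make the intended rule concrete (drop a zero-attendee row only when the same company and date also has a row with attendees > 0), here is a small sketch run through Python's sqlite3. SQLite accepts a DELETE whose subquery reads the same table, so the EXISTS form can be used directly here; as far as I know MySQL rejects that, which is why the derived-table wrapper above is needed there. The sample rows, the index, and the name delete_zero_duplicates are assumptions made for the example.

```python
import sqlite3


def delete_zero_duplicates(conn):
    """Delete attendees=0 rows that have a sibling row with attendees > 0."""
    cur = conn.execute(
        """
        DELETE FROM attendance
        WHERE attendees = 0
          AND EXISTS (
              SELECT 1 FROM attendance AS b
              WHERE b.date = attendance.date
                AND b.company_name = attendance.company_name
                AND b.attendees > 0
          )
        """
    )
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attendance ("
    "id INTEGER PRIMARY KEY, date TEXT, company_name TEXT, attendees INTEGER)"
)
# An index on the matching columns keeps the per-row lookup cheap.
conn.execute("CREATE INDEX idx_company_date ON attendance (company_name, date)")
conn.executemany(
    "INSERT INTO attendance VALUES (?, ?, ?, ?)",
    [
        (1, "2020-01-01", "Acme", 0),   # removed: row 2 has attendees > 0
        (2, "2020-01-01", "Acme", 12),
        (3, "2020-01-01", "Beta", 0),   # kept: no other row for Beta that day
        (4, "2020-01-02", "Acme", 0),   # kept: both rows that day are zero
        (5, "2020-01-02", "Acme", 0),
    ],
)

deleted = delete_zero_duplicates(conn)
remaining = [row[0] for row in conn.execute("SELECT id FROM attendance ORDER BY id")]
print(deleted, remaining)
```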