MySQL: how do I ignore all rows that share a value in column A if one of those rows has a specific value in column B?

I am looking for a way to filter out not only the duplicate rows but also the "initial" row. The goal is a clean list of all open positions. Sales and accounting use the list to see open positions, which is why the initial "Invoice" position has to be removed as well if a "Cancellation" exists for that invoice.
I've tried solutions with group by, subqueries and EXISTS, but can't get the expected result. Ideally, I get this to work as an additional filter inside the where clause.
Default
ID | Nr | Type | Amount
1 | NR-100 | Invoice | 100
2 | NR-101 | Invoice | 200
3 | NR-102 | Invoice | 300
4 | NR-100 | Cancellation | 100
5 | NR-102 | Cancellation | 300
6 | NR-103 | Invoice | 150
Expected results
ID | Nr | Type | Amount
2 | NR-101 | Invoice | 200
6 | NR-103 | Invoice | 150

An EXISTS test would seem to be the way to go, so I wonder what problem you had with it.
select *
from t
where type = 'invoice' and
not exists (select 1 from t t1 where t1.nr = t.nr and t1.type = 'cancellation')
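To check the NOT EXISTS filter against the sample data from the question, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for MySQL here, which is an assumption of this example; because SQLite compares strings case-sensitively by default, the literals are written with the same capitalisation as the stored values.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL table `t`.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, nr TEXT, type TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, "NR-100", "Invoice", 100),
        (2, "NR-101", "Invoice", 200),
        (3, "NR-102", "Invoice", 300),
        (4, "NR-100", "Cancellation", 100),
        (5, "NR-102", "Cancellation", 300),
        (6, "NR-103", "Invoice", 150),
    ],
)

# Keep invoices only when no cancellation row shares the same nr.
open_positions = conn.execute(
    """
    SELECT id, nr, type, amount
    FROM t
    WHERE type = 'Invoice'
      AND NOT EXISTS (
          SELECT 1
          FROM t AS t1
          WHERE t1.nr = t.nr AND t1.type = 'Cancellation'
      )
    ORDER BY id
    """
).fetchall()

print(open_positions)
```

Only the rows with ID 2 and 6 remain, which matches the expected results in the question.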

Related

How to add a tag based on a column value

I'm trying to join two tables and select certain columns to display in the output including a 'flag' if a certain transaction amount is greater than or equal to 100. The flag would return a 1 if it is, else null.
I thought I could achieve this using a CASE in my SELECT but it only returns one record every time since it returns the first record that meets this condition. How do I just create this 'FLAG' column during my join easily?
SELECT payment_id, amount, type,
CASE
WHEN amount >= 100 THEN 1
ELSE NULL
END AS flag
FROM trans JOIN customers ON (user_id = cust_id)
JOIN bank ON (trans.bank = bank.id)
WHERE (error is false)
I expect an output such as:
payment_id amount type flag
1 81 3 NULL
2 104 2 1
3 150 2 1
4 234 1 1
However, I'm only getting the first record such as:
payment_id amount type flag
2 104 2 1
I tried your table structure locally and it works perfectly.
One thing I need from you: which table contains the error column?
If I comment out the WHERE condition, the query works fine.
If you're getting fewer rows than you expect, it's due to one of the following:
1. Join condition
You're doing INNER joins to the customers and bank tables. If you have 4 source rows in your trans table, but only one row that matches in your customers table (condition user_id = cust_id), then you will only have one row returned.
The same goes for the subsequent join to your bank table. If you somehow have a transaction that references a bank which is not defined in the bank table, then you won't see a record for this row.
2. WHERE clause
Obviously you won't see any rows that don't meet the conditions specified here.
It's probably #1 -- check to see if the rows with payment_id IN (1,3,4) have corresponding customer id values in the customers table and corresponding bank id values in the bank table.
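One way to run that check is to swap the INNER joins for LEFT joins and list the transactions that have no partner row. The sketch below does this in an in-memory SQLite database; the table contents and the reduced column set are made up for illustration, so only the shape of the query carries over to the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: payments 1 and 4 reference unknown customers,
# payment 3 references an unknown bank.
conn.executescript(
    """
    CREATE TABLE customers (cust_id INTEGER PRIMARY KEY);
    CREATE TABLE bank (id INTEGER PRIMARY KEY);
    CREATE TABLE trans (
        payment_id INTEGER PRIMARY KEY,
        user_id INTEGER,
        bank INTEGER,
        amount INTEGER
    );
    INSERT INTO customers VALUES (10);
    INSERT INTO bank VALUES (1), (2);
    INSERT INTO trans VALUES
        (1, 11, 1, 81), (2, 10, 1, 104), (3, 10, 9, 150), (4, 12, 2, 234);
    """
)

# LEFT JOIN keeps every trans row; NULLs on the right side mark the
# rows that an INNER JOIN would silently drop.
unmatched = conn.execute(
    """
    SELECT t.payment_id,
           c.cust_id IS NULL AS missing_customer,
           b.id IS NULL AS missing_bank
    FROM trans AS t
    LEFT JOIN customers AS c ON t.user_id = c.cust_id
    LEFT JOIN bank AS b ON t.bank = b.id
    WHERE c.cust_id IS NULL OR b.id IS NULL
    ORDER BY t.payment_id
    """
).fetchall()

for payment_id, missing_customer, missing_bank in unmatched:
    print(payment_id, bool(missing_customer), bool(missing_bank))
```

Any payment_id listed here would disappear from the original query before the CASE expression is ever evaluated.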

Comparing two successive rows

I have Database table payment such as below
level_count | amount
__________________________
650 | 12
1000 | 35
1700 | 50
__________________________
Now I want to look up the amount for an input that is compared against the level_count column. If I supply 650, I should get amount 12. If I supply 999, I should still get 12, because the input has to be compared against successive rows. If I enter 1200 I should get 35, and if I enter 1700 or above I should get 50.
I have tried the following but didn't have any success.
Where am I going wrong?
SELECT * FROM payment T1
INNER JOIN payment T2 on T1.level_count>=T2.level_count AND T1.level_count<T2.level_count
WHERE T1.level_count = '650'
When I execute above query I get no results.
You may try using a LIMIT query here:
SELECT *
FROM payment
WHERE level_count <= 999 -- or 650, or another input value
ORDER BY level_count DESC
LIMIT 1;
The logic of the above query works in two parts. First, the WHERE clause removes all records for which the level_count is greater than the input value. But this still leaves us potentially with more than one record (including the record we actually want). The ORDER BY ... DESC LIMIT 1 trick then keeps the single remaining record with the highest level_count.
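As a quick check of that logic against the inputs mentioned in the question, here is a small sketch using Python's sqlite3 module with an in-memory table. The helper name amount_for is made up for this example, and SQLite is used as a stand-in for the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payment (level_count INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO payment VALUES (?, ?)",
    [(650, 12), (1000, 35), (1700, 50)],
)


def amount_for(level):
    """Return the amount of the highest level_count not above `level`."""
    row = conn.execute(
        """
        SELECT amount
        FROM payment
        WHERE level_count <= ?
        ORDER BY level_count DESC
        LIMIT 1
        """,
        (level,),
    ).fetchone()
    # Inputs below the smallest level_count match no row at all.
    return row[0] if row else None


for level in (650, 999, 1200, 1700, 5000, 100):
    print(level, amount_for(level))
```

Note the last case: an input below 650 returns no row, so the caller has to decide what that should mean.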

SQL Validate a column with the same column

I have the following situation. I have a table with all the article information, and I would like to compare a column with itself, because I have multiple types of article: single products and master products. The only way I can tell them apart is by SKU. For example:
ID | SKU
1 | 11111
2 | 11112
3 | 11113
4 | 11113-5
5 | 11113-8
6 | 11114
7 | 11115
8 | 11115-1-W
9 | 11115-2
10 | 11116
I want to list and/or count only the SKUs that are fully unique. In the example, the SKUs that are unique and have no variants are ID = 1, 2, 6 and 10. I want a query that does not count a SKU such as 11113 if it appears again in the column, so the total would be 4 unique SKUs and not 6. Please let me know if this is possible.
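Assuming master SKUs are 5 characters long and variants append a hyphenated suffix, try this:
select a.*
from mytable a
left join mytable b on b.sku like concat(a.sku, '-%')
where length(a.sku) = 5
and b.sku is null
This query joins master SKUs to child ones, but filters out successful joins - leaving only solitary master SKUs. The '-%' pattern matters: with a plain '%' every master SKU would also match itself, so no row would ever come back with a NULL on the right side.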
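The anti-join idea can be tried on the sample SKUs with the sketch below, which uses an in-memory SQLite database. Two things here are assumptions of the example rather than part of the answer: SQLite's || operator replaces MySQL's concat(), and variants are matched with a '-%' suffix pattern so that a row cannot match itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, sku TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (1, "11111"), (2, "11112"), (3, "11113"), (4, "11113-5"),
        (5, "11113-8"), (6, "11114"), (7, "11115"), (8, "11115-1-W"),
        (9, "11115-2"), (10, "11116"),
    ],
)

# Left anti-join: keep 5-character SKUs for which no variant row exists.
solitary = conn.execute(
    """
    SELECT a.id, a.sku
    FROM mytable AS a
    LEFT JOIN mytable AS b ON b.sku LIKE (a.sku || '-%')
    WHERE length(a.sku) = 5
      AND b.sku IS NULL
    ORDER BY a.id
    """
).fetchall()

print(solitary)
print(len(solitary))
```

On the sample data this leaves IDs 1, 2, 6 and 10, for a count of 4.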
You can do this by grouping and counting the unique rows.
First, we will need to take your table and add a new column, MasterSKU. This will be the first five characters of the SKU column. Once we have the MasterSKU, we can then GROUP BY it. This will bundle together all of the rows having the same MasterSKU. Once we are grouping we get access to aggregate functions like COUNT(). We will use that function to count the number of rows for each MasterSKU. Then, we will filter out any rows that have a COUNT() over 1. That will leave you with only the unique rows remaining.
Take that unique list and LEFT JOIN it back into your original table to grab the IDs.
SELECT ID, A.MasterSKU
FROM (
SELECT
MasterSKU = SUBSTRING(SKU,1,5),
MasterSKUCount = COUNT(*)
FROM MyTable
GROUP BY SUBSTRING(SKU,1,5)
HAVING COUNT(*) = 1
) AS A
LEFT JOIN (
SELECT
ID,
MasterSKU = SUBSTRING(SKU,1,5)
FROM MyTable
) AS B
ON A.MasterSKU = B.MasterSKU
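The grouping approach can be sketched in a more compact form as well. The version below runs in an in-memory SQLite database and makes two substitutions that are assumptions of this example: substr() stands in for SUBSTRING, and because each surviving group holds exactly one row, MIN(id) is used to recover the ID instead of joining back to the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, sku TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (1, "11111"), (2, "11112"), (3, "11113"), (4, "11113-5"),
        (5, "11113-8"), (6, "11114"), (7, "11115"), (8, "11115-1-W"),
        (9, "11115-2"), (10, "11116"),
    ],
)

# Group by the 5-character prefix and keep prefixes that occur once.
unique_masters = conn.execute(
    """
    SELECT MIN(id) AS first_id, substr(sku, 1, 5) AS master_sku
    FROM mytable
    GROUP BY substr(sku, 1, 5)
    HAVING COUNT(*) = 1
    ORDER BY master_sku
    """
).fetchall()

print(unique_masters)
```

Like the original query, this relies on every master SKU being exactly five characters long.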
Now, one thing I noticed from your example: the original SKU column really looks like three columns in one. We have multiple values being joined with hyphens.
11115-1-W
There may be a reason for it, but most likely this violates first normal form and will make the database hard to query. It's part of the reason why such a complicated query is needed. If the SKU column really represents multiple things then we may want to consider breaking it out into MasterSKU, Version, and Color or whatever each hyphen represents.

how to use group by on this table

here is a screen shot of my table
I am trying to remove all those rows whose sum of PostAmt comes to be 0 when grouped by the sales_contract_nbr and the name.
for example :
the sales_contract_nbr 51101008103 will be removed when grouped by name and sales_contract_nbr as -96.83 and 96.83 when summed up amounts to 0.
Quite simple, right?
But apart from this, I want to remove contract rows in pairs. If contract 51101008195 is grouped as a whole, it sums to 533.87, so it won't be removed (highlighted).
But I want to remove its rows in pairs.
For example:
Two rows of contract number 51101008195 should be summed first (see the image below): the amounts -533.87 and 533.87 should be summed to get a total of 0. Only one record for the contract should be left.
Update
More Description :
What I want to do is first group rows 1 and 2 (matching amounts, one positive and the other negative) and then group the others. If there were 4 rows with the same contract number, rows 1 and 2 should be grouped first, and then rows 3 and 4 should be grouped if their absolute amounts are the same; if not, rows 3 and 4 don't get deleted.
I want to use GROUP BY to eliminate the rows whose total ends up being 0 and which have the same name or the contract number.
I hope I have made the question clear. If not please ask.
how can it be done?
what i am doing till now is :
SELECT sales_contract_nbr
,name
,SUM(PostAmt) PostAmt
FROM tblMasData
GROUP BY sales_contract_nbr, name
thanks.
Here is what I come up with for now :
SELECT location
,sales_contract_nbr
,name
,SUM(absPostAmt * nbPostAmt) / ABS(SUM(nbPostAmt)) PostAmt
,ABS(SUM(nbPostAmt)) nbPostAmt
,SUM(absPostAmt * nbPostAmt) PostAmtTotal
FROM (
SELECT location
,sales_contract_nbr
,name
,PostAmt
,ABS(PostAmt) absPostAmt
,SUM(CASE WHEN PostAmt >= 0 THEN 1 ELSE -1 END) nbPostAmt
FROM tblMasData
GROUP BY location
,sales_contract_nbr
,name
,PostAmt
,ABS(PostAmt)
) t
GROUP BY location
,sales_contract_nbr
,name
,absPostAmt
HAVING SUM(absPostAmt * nbPostAmt) != 0
See SQLFiddle.
This doesn't totally answer your question, as if you have 100 + 100 - 200 for instance, it won't hide all three rows. But it can be pretty messy to find combinations which equal to 0 among a bunch of rows.
Moreover, if some rows have the same amount, they will be grouped. That's why I added a column counting those equal rows, and a column summing them up at the end.
This should at least allow you to deal with the data programmatically.
Let me know if this meets your needs, or if you need some improvement (which could involve some not so pretty SQL).
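For the programmatic route, here is a minimal Python sketch of the same pairing idea: within each contract and name, every positive amount cancels one negative amount of the same absolute value, and whatever is left over is kept. The function name and the name values "A" and "B" are made up for this example, and amounts are handled as Decimal to avoid floating-point surprises. Like the query above, it does not detect combinations such as 100 + 100 - 200.

```python
from collections import Counter
from decimal import Decimal


def cancel_offsetting_pairs(rows):
    """Drop +x/-x pairs per (contract, name); return the leftover rows.

    rows: iterable of (sales_contract_nbr, name, post_amt) tuples.
    """
    # Net signed count of rows for each absolute amount.
    net = Counter()
    for contract, name, amount in rows:
        net[(contract, name, abs(amount))] += 1 if amount >= 0 else -1

    leftovers = []
    for (contract, name, abs_amount), count in net.items():
        sign = 1 if count > 0 else -1
        # A net count of 0 means every row found a partner: emit nothing.
        leftovers.extend([(contract, name, sign * abs_amount)] * abs(count))
    return leftovers


rows = [
    ("51101008103", "A", Decimal("-96.83")),
    ("51101008103", "A", Decimal("96.83")),
    ("51101008195", "B", Decimal("-533.87")),
    ("51101008195", "B", Decimal("533.87")),
    ("51101008195", "B", Decimal("533.87")),
]
result = cancel_offsetting_pairs(rows)
print(result)
```

With the amounts from the question, contract 51101008103 disappears entirely and a single 533.87 row remains for contract 51101008195.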

Obtain running frequency distribution from previous N rows of MySQL database

I have a MySQL database where one column contains status codes. The column is of type int and the values will only ever be 100,200,300,400. It looks like below; other columns removed for clarity.
id | status
----------------
1 300
2 100
3 100
4 200
5 200
6 300
7 100
8 400
9 300
10 300
11 100
12 400
13 400
14 400
15 300
16 300
The id field is auto-generated and will always be sequential. I want to have a third column displaying a comma-separated string of the frequency distribution of the status codes of the previous 10 rows. It should look like this.
id | status | freq
-----------------------------------
1 300
2 100
3 100
4 200
5 200
6 300
7 100
8 400
9 300
10 300
11 100 300,100,200,400 -- from rows 1-10
12 400 100,300,200,400 -- from rows 2-11
13 400 100,300,200,400 -- from rows 3-12
14 400 300,400,100,200 -- from rows 4-13
15 300 400,300,100,200 -- from rows 5-14
16 300 300,400,100 -- from rows 6-15
I want the most frequent code listed first. And where two status codes have the same frequency it doesn't matter to me which is listed first but I did list the smaller code before the larger in the example. Lastly, where a code doesn't appear at all in the previous ten rows, it shouldn't be listed in the freq column either.
And to be very clear the row number that the frequency string appears on does NOT take into account the status code of that row; it's only the previous rows.
So what have I done? I'm pretty green with SQL. I'm a programmer and I find this SQL language a tad odd to get used to. I managed the following self-join select statement.
select *, avg(b.status) freq
from sample a
join sample b
on (b.id < a.id) and (b.id > a.id - 11)
where a.id > 10
group by a.id;
Using the aggregate function avg, I can at least demonstrate the concept. The derived table b provides the correct rows to the avg function but I just can't figure out the multi-step process of counting and grouping rows from b to get a frequency distribution and then collapse the frequency rows into a single string value.
Also I've tried using standard stored functions and procedures in place of the built-in aggregate functions, but it seems the b derived table is out of scope or something. I can't seem to access it. And from what I understand writing a custom aggregate function is not possible for me as it seems to require developing in C, something I'm not trained for.
Here's sql to load up the sample.
create table sample (
id int NOT NULL AUTO_INCREMENT,
PRIMARY KEY(id),
status int
);
insert into sample(status) values(300),(100),(100),(200),(200),(300)
,(100),(400),(300),(300),(100),(400),(400),(400),(300),(300),(300)
,(100),(400),(100),(100),(200),(500),(300),(100),(400),(200),(100)
,(500),(300);
The sample has 30 rows of data to work with. I know it's a long question, but I just wanted to be as detailed as I could be. I've worked on this for a few days now and would really like to get it done.
Thanks for your help.
The only way I know of to do what you're asking is to use a BEFORE INSERT trigger. It has to be BEFORE INSERT because you want to update a value in the row being inserted, which can only be done in a BEFORE trigger. Unfortunately, that also means it won't have been assigned an ID yet, so hopefully it's safe to assume that at the time a new record is inserted, the last 10 records in the table are the ones you're interested in. Your trigger will need to get the values of the last 10 ID's and use the GROUP_CONCAT function to join them into a single string, ordered by the COUNT. I've been using SQL Server mostly and I don't have access to a MySQL server at the moment to test this, but hopefully my syntax will be close enough to at least get you moving in the right direction:
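-- Still untested. Assumes a freq varchar column has been added to sample.
-- In the mysql client, wrap the statement in DELIMITER // ... // first.
create trigger sample_trigger BEFORE INSERT ON sample
FOR EACH ROW
BEGIN
DECLARE _freq varchar(50);
-- Count each status within the 10 most recent rows, most frequent first.
SELECT GROUP_CONCAT(tbl.status ORDER BY tbl.Occurrences DESC, tbl.status) INTO _freq
FROM (SELECT last10.status, COUNT(*) AS Occurrences
FROM (SELECT status FROM sample ORDER BY id DESC LIMIT 10) AS last10
GROUP BY last10.status) AS tbl;
SET new.freq = _freq;
END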
SELECT id, GROUP_CONCAT(status ORDER BY freq desc) FROM
(SELECT a.id as id, b.status, COUNT(*) as freq
FROM
sample a
JOIN
sample b ON (b.id < a.id) AND (b.id > a.id - 11)
WHERE
a.id > 10
GROUP BY a.id, b.status) AS sub
GROUP BY id;
SQL Fiddle
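To cross-check what the freq column should contain, the sliding-window logic can also be sketched outside the database. The Python below is only an illustration of the rule described in the question (previous 10 rows, most frequent code first, smaller code first on ties); the function name running_freq is made up, and the list holds the first 16 status values from the sample insert.

```python
from collections import Counter

# First 16 status values from the sample insert, in id order (id 1 first).
statuses = [300, 100, 100, 200, 200, 300, 100, 400,
            300, 300, 100, 400, 400, 400, 300, 300]


def running_freq(values, window=10):
    """For each row, list the codes of the previous `window` rows,
    most frequent first; ties put the smaller code first."""
    result = []
    for i in range(len(values)):
        if i < window:
            # Fewer than `window` previous rows: leave freq empty.
            result.append("")
            continue
        counts = Counter(values[i - window:i])
        ordered = sorted(counts, key=lambda code: (-counts[code], code))
        result.append(",".join(str(code) for code in ordered))
    return result


freq = running_freq(statuses)
for row_id, (status, text) in enumerate(zip(statuses, freq), start=1):
    print(row_id, status, text)
```

The strings produced for ids 11 to 16 agree with the expected freq column shown in the question.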