MySQL: detecting identical rows when there's no primary key

I'm dealing with a situation at work where someone set up some tables without a primary key (arghhh). Now I'm trying to compare dev data to prod data, and as best I can tell there's a slight difference in the number of rows.
I figured the best way to compare the data is to join on every column, but I'm getting unexpected results.
To test this out I joined the table to itself. The table has 1309 rows, but when I join on every column I get 1014. That's fewer rows. If anything I'd expect to get more. What gives?
select *
from `default_tire_classifications` tc1
join `default_tire_classifications` tc2
on tc1.`marketing_tread_name` = tc2.`marketing_tread_name`
AND tc1.`size` = tc2.`size`
AND tc1.product_category = tc2.product_category
AND tc1.application = tc2.application
AND tc1.vehicle_type = tc2.vehicle_type
AND tc1.oem_part = tc2.oem_part
AND tc1.position = tc2.position
AND tc1.size = tc2.size
AND tc1.sect_wdth = tc2.sect_wdth
AND tc1.aspect_ratio = tc2.aspect_ratio
AND tc1.rim_size = tc2.rim_size
AND tc1.speed_rating = tc2.speed_rating
AND tc1.load_index = tc2.load_index

I suspect some of the columns contain NULL values. An equality comparison to a NULL value yields NULL. (SQL tri-valued Boolean logic.)
To do a comparison that yields TRUE when both sides are NULL, you could do something like this
( tc1.col = tc2.col OR ( tc1.col IS NULL AND tc2.col IS NULL ) )
MySQL also provides a non-standard "null-safe" equality comparison operator <=> (spaceship) that does the same thing.
tc1.col <=> tc2.col
returns either TRUE or FALSE, and will return TRUE when the values on both sides are NULL.
So, replacing the = (equality comparison) operator with <=> operator should resolve the problem with comparing NULL values.
(This isn't to say that NULL values are a problem, or that they are the only problem.)
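The effect described above can be reproduced in a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so the operator differs: SQLite's IS plays the role of MySQL's <=>. The table name tires, its three columns, and the sample rows are hypothetical and only loosely modelled on the question's table.

```python
import sqlite3

# Hypothetical three-column stand-in for the question's table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tires (tread TEXT, size TEXT, oem_part TEXT)")
conn.executemany(
    "INSERT INTO tires VALUES (?, ?, ?)",
    [
        ("A1", "205/55R16", "X100"),  # no NULLs
        ("B2", "215/60R17", None),    # NULL in oem_part
        ("C3", None, None),           # NULLs in two columns
    ],
)


def self_join_count(op):
    """Self-join on every column, using `op` as the comparison operator."""
    sql = f"""
        SELECT COUNT(*)
        FROM tires t1
        JOIN tires t2
          ON t1.tread {op} t2.tread
         AND t1.size {op} t2.size
         AND t1.oem_part {op} t2.oem_part
    """
    return conn.execute(sql).fetchone()[0]


# With "=", a row that contains a NULL never matches even itself.
plain_count = self_join_count("=")
# "IS" is SQLite's null-safe comparison (MySQL spells it <=>).
null_safe_count = self_join_count("IS")
print(plain_count, null_safe_count)
```

With the plain equality join only the row without NULLs survives; the null-safe join returns every row. In MySQL the same experiment would replace = with <=> in the ON clause.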

If your columns contain NULL, you will lose those rows, because NULL = NULL is not true (it evaluates to NULL).

Related

MySQL LEFT JOIN with WHERE function-call produces wrong result

From MySQL 5.7 I am executing a LEFT JOIN, and the WHERE clause calls a user-defined function of mine. It fails to find a matching row which it should find.
[Originally I simplified my actual code a bit for the purpose of this post. However in view of a user's proposed response, I post the actual code as it may be relevant.]
My user function is:
CREATE FUNCTION `jfn_rent_valid_email`(
rent_mail_to varchar(1),
agent_email varchar(45),
contact_email varchar(60)
)
RETURNS varchar(60)
BEGIN
IF rent_mail_to = 'A' AND agent_email LIKE '%#%' THEN
RETURN agent_email;
ELSEIF contact_email LIKE '%#%' THEN
RETURN contact_email;
ELSE
RETURN NULL;
END IF;
END
My query is:
SELECT r.RentCode, r.MailTo, a.AgentEmail, co.Email,
jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email)
AS ValidEmail
FROM rents r
LEFT JOIN contacts co ON r.RentCode = co.RentCode -- this produces one match
LEFT JOIN link l ON r.RentCode = l.RentCode -- there will be no match in `link` on this
LEFT JOIN agents a ON l.AgentCode = a.AgentCode -- there will be no match in `agents` on this
WHERE r.RentCode = 'ZAKC17' -- this produces one match
AND (jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email) IS NOT NULL)
This produces no rows.
However, when a.AgentEmail is NULL, if I change only the condition from
AND (jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email) IS NOT NULL)
to
AND (jfn_rent_valid_email(r.MailTo, NULL, co.Email) IS NOT NULL)
it does correctly produce a matching row:
RentCode, MailTo, AgentEmail, Email, ValidEmail
ZAKC17, N, <NULL>, name#email, name#email
So, when a.AgentEmail is NULL (from non-matching LEFT JOINed row), why in the world does passing it to the function as a.AgentEmail act differently from passing it as a literal NULL?
[BTW: I believe I have used this kind of construct under MS SQL server in the past and it has worked as I would expect. Also, I can reverse the test of AND (jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email) IS NOT NULL) to AND (jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email) IS NULL) yet I still get no match. It's as though any reference to a.... as a parameter to the function causes no matching row...]
Most likely this is an issue with the optimizer turning the LEFT JOIN into an INNER JOIN. The optimizer may do this when it believes that the WHERE condition is always false for the generated NULL row (which, in this case, it is not).
You can look at the query plan with the EXPLAIN command; you will likely see a different table order depending on the query variation.
If the actual logic of the function is to check all emails with one function call, you may have better luck with a function that takes just one email address as a parameter, called once for each email column.
You can try without the function in the WHERE clause:
SELECT r.RentCode, r.MailTo, a.AgentEmail, co.Email,
jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email)
AS ValidEmail
FROM rents r
LEFT JOIN contacts co ON r.RentCode = co.RentCode -- this produces one match
LEFT JOIN link l ON r.RentCode = l.RentCode -- there will be no match in `link` on this
LEFT JOIN agents a ON l.AgentCode = a.AgentCode -- there will be no match in `agents` on this
WHERE r.RentCode = 'ZAKC17' -- this produces one match
AND ((r.MailTo='A' AND a.AgentEmail LIKE '%#%') OR co.Email LIKE '%#%' )
Or wrap the function in a subquery:
SELECT q.RentCode, q.MailTo, q.AgentEmail, q.Email, q.ValidEmail
FROM (
SELECT r.RentCode, r.MailTo, a.AgentEmail, co.Email,
jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email) AS ValidEmail
FROM rents r
LEFT JOIN contacts co ON r.RentCode = co.RentCode -- this produces one match
LEFT JOIN link l ON r.RentCode = l.RentCode -- there will be no match in `link` on this
LEFT JOIN agents a ON l.AgentCode = a.AgentCode -- there will be no match in `agents` on this
WHERE r.RentCode = 'ZAKC17' -- this produces one match
) as q
WHERE q.ValidEmail IS NOT NULL
Changing the call to the function in the WHERE clause to read
jfn_rent_valid_email(r.MailTo, IFNULL(a.AgentEmail, NULL), IFNULL(co.Email, NULL)) IS NOT NULL
solves the issue.
It appears that the optimizer incorrectly assumes the function will return NULL in the non-matching LEFT JOIN case whenever a plain reference to a.AgentEmail is passed as a parameter. If the column reference is inside any kind of expression, the optimizer no longer makes that assumption. Wrapping it in a "dummy", seemingly pointless IFNULL(column, NULL) is thus enough to restore correct behaviour.
I am marking this as the accepted solution because it is by far the simplest workaround, requiring the least code change and no complete query rewrite.
However, full credit is due to @slaakso's post here in this topic for analysing the problem. Note that he states that the behaviour has been fixed/altered in MySQL 8 such that this workaround is unnecessary, so it may only be needed in MySQL 5.7 or earlier.

MySQL Left outer join on Knex not returning correct data

I'm troubleshooting some errors encountered in Knex, and am trying to do a left join with 2 tables, namely, notifications and metadata, with both these tables having the same 2 columns, 'device_id' and 'channel' that I would like to match. However, the below query string doesn't work and returns the following even though there is a metadata record (metadata_id=1) with matching device_id and channel.
I checked that the datatypes are also the same for device_id and channel in both tables. Been stuck for some time and not sure what is wrong here, would be great if someone can help! Also having some problems with translating to Knex for nested queries, but this is probably a small problem.
{
notification_id: 1,
message: 'hello world',
mode: 'email',
metadata_id: null,
unit_conversion: null
}
SELECT `notifications`.`notification_id`, `notifications`.`message`, `notifications`.`mode`,
`metadata`.`metadata_id`, `metadata`.`unit_conversion` from `notifications`
LEFT OUTER JOIN `metadata` ON (`metadata`.`device_id` = `notifications`.`device_id` AND
`metadata`.`channel` = `notifications`.`channel` AND `metadata`.`deleted_at` = null )
WHERE `notifications`.`notification_id` = 1
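metadata.deleted_at = null should be replaced with metadata.deleted_at IS NULL. A comparison with = null never evaluates to true, so the join condition can never match and the metadata columns come back as null. See the excellent explanation by Bohemian here: https://stackoverflow.com/a/9581790/6597774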
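A minimal sketch of the difference, again using Python's sqlite3 module rather than MySQL or Knex (both engines treat = NULL the same way here). The tables are cut down to the columns needed for the join, and the column types and sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE notifications (notification_id INTEGER, device_id INTEGER, channel INTEGER);
    CREATE TABLE metadata (metadata_id INTEGER, device_id INTEGER, channel INTEGER, deleted_at TEXT);
    INSERT INTO notifications VALUES (1, 10, 2);
    INSERT INTO metadata VALUES (1, 10, 2, NULL);  -- not deleted
""")


def joined_metadata_id(deleted_check):
    """Run the LEFT JOIN with the given deleted_at condition in the ON clause."""
    sql = f"""
        SELECT m.metadata_id
        FROM notifications n
        LEFT OUTER JOIN metadata m
          ON m.device_id = n.device_id
         AND m.channel = n.channel
         AND {deleted_check}
        WHERE n.notification_id = 1
    """
    return conn.execute(sql).fetchone()[0]


# "= NULL" is never true, so no metadata row joins and the column is NULL.
with_equals = joined_metadata_id("m.deleted_at = NULL")
# "IS NULL" matches the undeleted metadata row.
with_is_null = joined_metadata_id("m.deleted_at IS NULL")
print(with_equals, with_is_null)
```

The first variant reproduces the metadata_id: null result from the question; the second returns the matching metadata row.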

Having trouble with an IFNULL in a mySQL WHERE clause

Before anyone says, I have searched through for a suitable answer for my issue but cannot find anything specific enough so I thought I'd ask it.
Basically I am trying to select a bunch of data for a report of people who have made loan applications to a website, but there are two different types: unsecured and guarantee. I need to place an IFNULL statement in the WHERE clause so that I ONLY use that clause if a certain other field isn't null.
Here is my statement:
SELECT
la.`lms_loan_application_id`,
la.`created`,
la.`updated`,
la.`loan_amount`,
la.`loan_term`,
la.`loan_document_fee`,
la.`broker_reference`,
la.`broker_sub_reference`,
laa.`first_name`,
laa.`surname`,
laa.`dob`,
laa.`email`,
laa.`mobile_number`,
laaAd.`address_postcode`,
lag.`first_name`,
lag.`surname`,
lag.`dob`,
lag.`email`,
lag.`mobile_number`,
lagAd.`address_postcode`,
lagAd.`housing_status`
FROM
loan_application AS la
JOIN
loan_application_applicant AS laa ON la.`id` = laa.`loan_application`
LEFT JOIN
loan_application_guarantor AS lag ON la.`id` = lag.`loan_application`
JOIN
loan_application_address AS laaAd ON laaAd.`loan_application_applicant` = laa.`id`
LEFT JOIN
loan_application_address AS lagAd ON lagAd.`loan_application_guarantor` = lag.`id`
WHERE
la.`status` = 'signature_given'
AND ! IFNULL(lag.`first_name`,
lag.`status` = 'signature_given')
AND laa.`status` = 'signature_given'
AND ! IFNULL(lag.`first_name`,
lagAd.`current_address` = 1)
AND laaAd.`current_address` = 1
ORDER BY la.`updated` DESC
LIMIT 10000
As you can see, I have attempted to use the IFNULLs (although in a negated way, which I assume works?) but all I get is duplicate row results and not the result set I really want.
Basically, I need to apply the WHERE conditions "lag.status = 'signature_given'" and "lagAd.current_address = 1" ONLY if the lag.first_name field is NOT null (i.e. there is a guarantor name); otherwise the status won't exist, and the unsecured loans will not show in the results. Hope I'm explaining this well enough!
In summary, I need to show all loan information, unsecured and guaranteed, and use a negated IFNULL in order to determine when the WHERE clause is to be taken into consideration.
Any help appreciated!
Thank you in advance
Michael
From this MySQLTutorial article:
Notice that you should avoid using the IFNULL function in the WHERE clause, because it degrades the performance of the query. If you want to check if a value is NULL or not, you can use IS NULL or IS NOT NULL in the WHERE clause.
Here is a WHERE clause which implements your logic correctly using IS NULL and IS NOT NULL instead of IFNULL:
WHERE la.`status` = 'signature_given' AND
(lag.`first_name` IS NULL OR
(lag.`first_name` IS NOT NULL AND lag.`status` = 'signature_given')) AND
laa.`status` = 'signature_given' AND
(lag.`first_name` IS NULL OR
(lag.`first_name` IS NOT NULL AND lagAd.`current_address` = 1)) AND
laaAd.`current_address` = 1
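The pattern can be tried on a cut-down version of the schema. The sketch below uses Python's sqlite3 module instead of MySQL, keeps only the application and guarantor tables, and invents the sample rows and the 'pending' status for illustration. It also drops the inner IS NOT NULL test, which is redundant once the IS NULL branch of the OR has failed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE loan_application (id INTEGER, status TEXT);
    CREATE TABLE loan_application_guarantor (
        loan_application INTEGER, first_name TEXT, status TEXT
    );
    INSERT INTO loan_application VALUES (1, 'signature_given');  -- unsecured, no guarantor
    INSERT INTO loan_application VALUES (2, 'signature_given');  -- guarantor has signed
    INSERT INTO loan_application VALUES (3, 'signature_given');  -- guarantor has not signed
    INSERT INTO loan_application_guarantor VALUES (2, 'Ann', 'signature_given');
    INSERT INTO loan_application_guarantor VALUES (3, 'Bob', 'pending');
""")

rows = conn.execute("""
    SELECT la.id
    FROM loan_application AS la
    LEFT JOIN loan_application_guarantor AS lag
           ON la.id = lag.loan_application
    WHERE la.status = 'signature_given'
      -- Apply the guarantor check only when a guarantor row exists.
      AND (lag.first_name IS NULL OR lag.status = 'signature_given')
    ORDER BY la.id
""").fetchall()

kept_ids = [row[0] for row in rows]
print(kept_ids)
```

The unsecured loan and the loan with a signed guarantor are kept; the loan whose guarantor has not signed is filtered out.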

Conditional Split fails if value is NULL in SSIS

I am passing the result of a FULL OUTER JOIN to a Conditional Split and filtering records on the basis of the following rules. Both tables have the same schema, and the primary key values are the same.
a. If Primary key of Source is NULL
b. If Primary Key of Destination is NULL
c. If Source and Destination key matches.
It works fine for (a) and (b) but fails for (c)
Source.Id == Destination.Id
and throws an exception saying that the condition evaluated to NULL where a Boolean was expected. How can I make this work?
The Conditional Split gets its input from a Merge Join, and it's a FULL OUTER JOIN because I need the full outer join results here.
Your third condition should start with an ISNULL check before you compare the values, like the following:
!ISNULL(Source.Id) && !ISNULL(Destination.Id) && Source.Id == Destination.Id
You need to handle every column that can be NULL in your condition.
Since you are comparing Id's, another option would be:
(ISNULL(Source.Id) ? 0 : Source.Id) == (ISNULL(Destination.Id) ? 0 : Destination.Id)
If comparing strings, you can replace the zeroes with blank spaces.
Alternatively you can use the following syntax:
REPLACENULL(Source.Id,0) == REPLACENULL(Destination.Id,0)

Allowing Optional Parameters for MySQL Query

I have a search page that has multiple fields that are used to create a refined search. Every field is optional. I'm trying to start crafting my sql query so that it will work given the proper variables but I'm having trouble.
Here is the SQL query I currently have:
SELECT
indicator.indid,
indicator.indicator,
indtype.indtype,
provider.provider,
report.report,
actor.actor
FROM
actor,
indicator,
indtype,
report,
provider
WHERE
indicator.indtypeid = indtype.indtypeid
AND indicator.actorid = actor.actorid
AND indicator.reportid = report.reportid
AND report.providerid = provider.providerid
AND indicator.indicator LIKE '%$indicator%'
AND indicator.indtypeid = $indtypeid;
Whenever I provide an indicator and an indtypeid, the search works just fine. However, when I leave the indtypeid field blank, and have the variable set to * (as its default value), the query returns no results. I've tried playing with the query manually and it doesn't seem to like the * or a % sign. Basically, if only an indicator is specified and no indtypeid is specified, I want to return all indicators for all indtypeids.
I'm sure I'm missing something minor, but I would appreciate any assistance that could be provided. I may be going about this all wrong in the first place.
Try this instead:
SELECT i.indid, i.indicator, it.indtype,
p.provider, r.report, a.actor
FROM actor a
INNER JOIN indicator i ON a.actorid = i.actorid
INNER JOIN indtype it ON i.indtypeid = it.indtypeid
INNER JOIN report r ON i.reportid = r.reportid
INNER JOIN provider p ON r.providerid = p.providerid
WHERE 1 = 1
AND ($indicator IS NULL OR i.indicator LIKE '%$indicator%')
AND ($indtypeid IS NULL OR i.indtypeid = $indtypeid);
So if you pass $indicator as NULL, the first condition, ($indicator IS NULL OR i.indicator LIKE '%$indicator%'), resolves to true and is effectively ignored; the same goes for the second condition.
I've also replaced the join conditions in the WHERE clause with explicit JOINs. The WHERE 1 = 1 is there so the query still works when both $indicator and $indtypeid are NULL; in that case it returns all results, since 1 = 1 is always true.
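A sketch of the same optional-filter idea with bound parameters instead of string interpolation, so that an unset filter really arrives as NULL. It uses Python's sqlite3 module rather than MySQL; the search helper, the single-table schema, and the sample rows are hypothetical. In MySQL the pattern string would be built with CONCAT('%', ?, '%') rather than the || operator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE indicator (indid INTEGER, indicator TEXT, indtypeid INTEGER);
    INSERT INTO indicator VALUES (1, 'evil.example.com', 1);
    INSERT INTO indicator VALUES (2, 'bad.example.org', 2);
    INSERT INTO indicator VALUES (3, '10.0.0.1', 3);
""")


def search(indicator=None, indtypeid=None):
    """Return matching indid values; a filter left as None is skipped."""
    sql = """
        SELECT indid
        FROM indicator
        WHERE (:indicator IS NULL OR indicator LIKE '%' || :indicator || '%')
          AND (:indtypeid IS NULL OR indtypeid = :indtypeid)
        ORDER BY indid
    """
    params = {"indicator": indicator, "indtypeid": indtypeid}
    return [row[0] for row in conn.execute(sql, params)]


print(search())                                  # no filters
print(search(indicator="example"))               # text filter only
print(search(indicator="example", indtypeid=2))  # both filters
print(search(indtypeid=3))                       # type filter only
```

Binding the values also keeps user input out of the SQL text, which the interpolated '%$indicator%' form does not.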