Finding distinct pairs in SQL - MySQL

I was trying to learn non-equi joins when I encountered this problem. I have a table pops:
+------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+-------+
| country | varchar(100) | YES | | NULL | |
| continent | varchar(100) | YES | | NULL | |
| population | bigint(20) | YES | | NULL | |
+------------+--------------+------+-----+---------+-------+
I was trying to find pairs of countries whose populations are within, say, 100 of each other.
select distinct
p1.country,
p2.country,
p1.population,
p2.population
from pops p1
inner join pops p2
on p1.population between p2.population - 100 and p2.population + 100
and p1.country <> p2.country
where p2.country <> p1.country
The output I got was:
+------------+------------+------------+------------+
| country | country | population | population |
+------------+------------+------------+------------+
| pakistan | india | 99988 | 99999 |
| china | india | 99990 | 99999 |
| bangladesh | japan | 999 | 999 |
| india | pakistan | 99999 | 99988 |
| china | pakistan | 99990 | 99988 |
| japan | bangladesh | 999 | 999 |
| india | china | 99999 | 99990 |
| pakistan | china | 99988 | 99990 |
+------------+------------+------------+------------+
As we can see, I am getting the pair (india, pakistan) as well as (pakistan, india), which is data-wise the same thing. Is it possible to eliminate one of the records from each pair?

You could decide to always have the lexicographically first (or last, for argument's sake) country on the p1 side: use < (or >) instead of <>.
Also, note that your where clause is redundant, since you already have this condition in the on clause of the join:
select p1.country,
p2.country,
p1.population,
p2.population
from pops p1
inner join pops p2
on p1.population between p2.population - 100 and p2.population + 100 and
p1.country < p2.country -- here: < instead of <>

Just change the join condition from p1.country <> p2.country to p1.country < p2.country.
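To check that < really keeps one row per pair, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, so that it runs without a server), and the populations are copied from the output in the question. The function name close_pairs and the tolerance parameter are made up for illustration.

```python
import sqlite3

def close_pairs(rows, tolerance=100):
    """Return each pair of countries with populations within `tolerance`, once per pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pops (country TEXT, continent TEXT, population INTEGER)")
    conn.executemany("INSERT INTO pops VALUES (?, ?, ?)", rows)
    query = """
        SELECT p1.country, p2.country, p1.population, p2.population
        FROM pops p1
        INNER JOIN pops p2
           ON p1.population BETWEEN p2.population - ? AND p2.population + ?
          AND p1.country < p2.country   -- '<' instead of '<>' keeps one row per pair
        ORDER BY p1.country, p2.country
    """
    result = conn.execute(query, (tolerance, tolerance)).fetchall()
    conn.close()
    return result

# continent is left as NULL because the query never reads it
sample = [
    ("pakistan", None, 99988),
    ("india", None, 99999),
    ("china", None, 99990),
    ("bangladesh", None, 999),
    ("japan", None, 999),
]

pairs = close_pairs(sample)
for row in pairs:
    print(row)
```

Each unordered pair now shows up exactly once, with the alphabetically smaller country in the first column.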

Related

Problem with renaming a field in sql table

So I have this table:
SELECT * FROM table LIMIT 10;
+----+----------------+------------+----------+----------------+--------+--------+
| sex_id | First name | year of beginning | year of ending | country | | |
+----+----------------+------------+----------+----------------+--------+--------+
| 56| mimic | 1987 | NULL | United Kingdom | Group | NULL |
| 3 | charales glass | 1941 | NULL | United States | Person | Male |
| 33| Grass | 1983 | 2000 | United Kingdom | Group | NULL |
| 67| Mother | 1989 | 2000 | United States | Group | NULL |
| 69| wind of lollie | 1950 | NULL | United States | Person | Male |
+----+----------------+------------+----------+----------------+--------+--------+
I then made the table smaller to show what I want to change, which is the end year, and I want to set it to NULL.
ERROR 1366 (HY000): Incorrect integer value 'NULL' for column 'end_year' at row 155
Please remove quotes from 'NULL' in your update statement.
UPDATE table_name SET column_name=NULL WHERE name='Specific name';
The quoted string 'NULL' is not the same thing as NULL, so you should use end_year=NULL.
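A small sketch of the difference, run against an in-memory SQLite database through Python's sqlite3 module (an assumption made for runnability; the original question is about MySQL). The table name artists and the column begin_year are hypothetical; only end_year appears in the error message above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, loosely modelled on two rows from the question
conn.execute("CREATE TABLE artists (name TEXT, begin_year INTEGER, end_year INTEGER)")
conn.executemany(
    "INSERT INTO artists VALUES (?, ?, ?)",
    [("Grass", 1983, 2000), ("Mother", 1989, 2000)],
)

# Unquoted NULL is the SQL null marker; 'NULL' in quotes would be a four-character string
conn.execute("UPDATE artists SET end_year = NULL WHERE name = ?", ("Grass",))

# NULL has to be tested with IS NULL, not with = NULL
cleared = conn.execute("SELECT name FROM artists WHERE end_year IS NULL").fetchall()
print(cleared)
```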

How to group by columns in a different table

I am trying to write a query to return the sum of totalRxCount that is grouped by zipcode.
I have two tables named fact2 and demographic.
My problem is that the demographic table contains duplicate rows, which affects the sum of totalRxCount.
To avoid duplicates, I want to return only results where npiNum is distinct.
Right now I have this working, but it is grouping by relId (the primary key).
I cannot figure out a way to group by zipcode, since this column and totalRxCount are in separate tables.
When I try this, I get wrong results because it counts the duplicate rows.
Here is my query. I want to modify it to return results grouped by zipcode instead of relId.
Any input will be greatly appreciated!
SELECT fact2.relID
, SUM(fact2.`totalRxCount`)
FROM fact2
LEFT
JOIN (
SELECT O1.relId, COUNT(DISTINCT O1.npiNum)
FROM demographic As O1
GROUP BY O1.relId
) AS d1
ON d1.`relId` = fact2.relID
LEFT
JOIN (
SELECT O2.relID, Sum(O2.totalRxCount)
FROM fact2 AS O2
GROUP BY O2.relID
) AS p1
ON p1.relID = d1.relId
WHERE (monthEndDate BETWEEN 201911 AND 202010) GROUP BY fact2.relID;
Results:
+-------+---------------------------+
| relID | SUM(fact2.totalRxCount) |
+-------+---------------------------+
| 2465 | 2 |
+-------+---------------------------+
What I've tried
SELECT zipcode, SUM(fact2.`totalRxCount`)
FROM fact2
INNER JOIN demographic ON demographic.relId=fact2.relID
LEFT JOIN (
SELECT O1.`relId`, COUNT(DISTINCT O1.`npiNum`)
FROM demographic As O1
GROUP BY O1.`relId`
) AS d1
ON d1.`relId` = fact2.`relID`
LEFT JOIN (
SELECT O2.`relID`, Sum(O2.`totalRxCount`)
FROM fact2 AS O2
GROUP BY O2.`relID`
) AS p1
ON p1.`relID` = d1.`relId`
WHERE (`monthEndDate` BETWEEN 201911 AND 202010) GROUP BY zipcode;
This returns the sum multiplied by the number of duplicate rows in demographic.
Results:
+---------+---------------------------+
| zipcode | SUM(fact2.`totalRxCount`) |
+---------+---------------------------+
| 66097 | 4 |
+---------+---------------------------+
^ This should be 2
demographic table:
+-------+---------+------------+------------+-----------+------------+------------------------------------+-------+----------+----------+-----------------+------------+-------+--------------+---------+----------+-----------+--------+-------------+--------+--------+----------------+
| relId | zipcode | providerId | writerType | firstName | middleName | lastName | title | specCode | specDesc | address | city | state | amaNoContact | pdrpInd | pdrpDate | deaNum | amaNum | amaCheckDig | npiNum | terrId | callStatusCode |
+-------+---------+------------+------------+-----------+------------+------------------------------------+-------+----------+----------+-----------------+------------+-------+--------------+---------+----------+-----------+--------+-------------+--------+--------+----------------+
| 2465 | 66097 | | A | | | JEFFERSON COUNTY MEMORIAL HOSPITAL | | | | 408 DELAWARE ST | WINCHESTER | KS | | | | AJ4281096 | | | | 11604 | |
| 2465 | 66097 | | A | | | JEFFERSON COUNTY MEMORIAL HOSPITAL | | | | 408 DELAWARE ST | WINCHESTER | KS | | | | AJ4281096 | | | | 11604 | |
+-------+---------+------------+------------+-----------+------------+------------------------------------+-------+----------+----------+-----------------+------------+-------+--------------+---------+----------+-----------+--------+-------------+--------+--------+----------------+
fact2
+-------+----------+-----------------+-----------+-------------------+----------+------------+------------+--------+------------+--------------+------------+---------------+--------------+-----------+--------------+-------------+-----------+--------------+-------------+
| relID | marketId | marketName | productID | productName | dataType | providerId | writerType | planId | pmtTypeInd | monthEndDate | newRxCount | refillRxCount | totalRxCount | newRxQuan | refillRxQuan | totalRxQuan | newRxCost | refillRxCost | totalRxCost |
+-------+----------+-----------------+-----------+-------------------+----------+------------+------------+--------+------------+--------------+------------+---------------+--------------+-----------+--------------+-------------+-----------+--------------+-------------+
| 2465 | 10871 | GALT PP MONTHLY | 1399451 | ZOLPIDEM TARTRATE | 15 | | A | 900145 | C | 202004 | 1 | 0 | 1 | 30 | 0 | 30 | 139 | 0 | 139 |
| 2465 | 10871 | GALT PP MONTHLY | 1399458 | ESZOPICLONE | 15 | | A | 900145 | C | 202006 | 1 | 0 | 1 | 30 | 0 | 30 | 350 | 0 | 350 |
+-------+----------+-----------------+-----------+-------------------+----------+------------+------------+--------+------------+--------------+------------+---------------+--------------+-----------+--------------+-------------+-----------+--------------+-------------+
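One possible way around the double counting, sketched here and not taken from the original thread: collapse demographic to distinct (relId, zipcode) pairs before joining, so each fact2 row matches only once. This assumes every relId maps to a single zipcode. The sketch uses Python's sqlite3 module in place of MySQL and keeps only the columns the query touches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed-down versions of the two tables: only the columns used by the query
conn.execute("CREATE TABLE demographic (relId INTEGER, zipcode TEXT, npiNum TEXT)")
conn.execute("CREATE TABLE fact2 (relID INTEGER, monthEndDate INTEGER, totalRxCount INTEGER)")
conn.executemany(
    "INSERT INTO demographic VALUES (?, ?, ?)",
    [(2465, "66097", ""), (2465, "66097", "")],  # the duplicate rows from the question
)
conn.executemany(
    "INSERT INTO fact2 VALUES (?, ?, ?)",
    [(2465, 202004, 1), (2465, 202006, 1)],
)

query = """
    SELECT d.zipcode, SUM(f.totalRxCount) AS totalRx
    FROM fact2 AS f
    INNER JOIN (
        SELECT DISTINCT relId, zipcode   -- one row per relId/zipcode pair
        FROM demographic
    ) AS d ON d.relId = f.relID
    WHERE f.monthEndDate BETWEEN 201911 AND 202010
    GROUP BY d.zipcode
"""
totals = conn.execute(query).fetchall()
print(totals)
```

If a relId could appear under several zipcodes, its counts would be added to each of them, so that case would need a rule for picking one zipcode.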

MySQL returns bad result

I have a question about a SELECT ... FROM ... WHERE statement that returns an unexpected result.
Here is my table called friends:
+----------+-----------+------------+--------+--------+-------+
| lastname | firstname | callprefix | phone | region | zip |
+----------+-----------+------------+--------+--------+-------+
| Lužný | Bob | 602 | 111222 | OL | 79821 |
| Matyáš | Bob | 773 | 123456 | BR | NULL |
| Strouhal | Fido | 300 | 343434 | ZL | 76701 |
| Přikryl | Tom | 581 | 010101 | PL | 72000 |
| Černý | Franta | 777 | 000999 | OL | 79801 |
| Zavadil | Olda | 911 | 111311 | OL | 79604 |
| Berka | Standa | 604 | 111234 | ZL | 72801 |
| Vlcik | BbB | 736 | 555444 | KV | 35210 |
+----------+-----------+------------+--------+--------+-------+
And here is my query.
SELECT * FROM friends WHERE region <= 'z';
I would expect that the rows with region ZL should be present, but they are not. Can you please tell me why?
Result is:
+----------+-----------+------------+--------+--------+-------+
| lastname | firstname | callprefix | phone | region | zip |
+----------+-----------+------------+--------+--------+-------+
| Lužný | Bob | 602 | 111222 | OL | 79821 |
| Matyáš | Bob | 773 | 123456 | BR | NULL |
| Přikryl | Tom | 581 | 010101 | PL | 72000 |
| Černý | Franta | 777 | 000999 | OL | 79801 |
| Zavadil | Olda | 911 | 111311 | OL | 79604 |
| Vlcik | BbB | 736 | 555444 | KV | 35210 |
+----------+-----------+------------+--------+--------+-------+
When I try this query:
SELECT * FROM friends WHERE region >= 'z';
the result contains both rows with region = 'ZL'
????
Thank you!
Because "ZL" is greater than "Z". Z is just one character, so the query will only return values that are less than Z or equal to Z. What are you trying to achieve with this query?
Can you please tell me why?
If you added a record where region is Z and sorted those rows alphabetically by region, would you expect ZL to come before or after Z? Obviously it would come after, so it does not meet your criteria.
If you want to only consider the first character, then add that to your criteria:
SELECT * FROM friends WHERE LEFT(region,1) <= 'Z';
I would also make Z explicitly a capital letter in case your database settings make it a case-sensitive search.
Have you tried
SELECT * FROM friends WHERE region <= 'zl';?
From the computer's perspective, 'z' < 'zl'
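The comparison can be tried out with a short sketch. It uses Python's sqlite3 module, not MySQL, so two substitutions are assumptions of the sketch: COLLATE NOCASE imitates a case-insensitive MySQL collation, and substr(region, 1, 1) stands in for LEFT(region, 1), which SQLite does not provide. Only the region column from the question is loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE friends (region TEXT)")
# Region values in the same order as the rows of the friends table
conn.executemany(
    "INSERT INTO friends VALUES (?)",
    [("OL",), ("BR",), ("ZL",), ("PL",), ("OL",), ("OL",), ("ZL",), ("KV",)],
)

# Whole-string comparison: 'ZL' sorts after 'z', so both ZL rows are filtered out
whole = conn.execute(
    "SELECT region FROM friends WHERE region <= 'z' COLLATE NOCASE"
).fetchall()

# First-character comparison: 'Z' equals 'z' when case is ignored, so ZL rows stay
first = conn.execute(
    "SELECT region FROM friends WHERE substr(region, 1, 1) <= 'z' COLLATE NOCASE"
).fetchall()

print(len(whole), len(first))
```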

Returning Only First Distinct Value In Sorted MySql Join Query

I have two MySQL tables, one for "Locations" and one for "Images". I need to get the most recent Image taken at each of a particular set of Locations (given as a comma-delimited list), but I only want to return the record for the most recent Image, and I've been struggling mightily to get the right results so far.
So, I have:
Locations:
+---------------------------------------------+
| ID | Name |
|----|----------------------------------------|
| 1 | Indiana |
| 2 | Ohio |
| 3 | Illinois |
+---------------------------------------------+
Images:
+---------------------------------------------+
| ID | User | Location | Date |
|----|-------|-----------|--------------------|
| 1 | Ray | 1 | 2012-06-22 |
| 2 | Robert| 3 | 2011-09-18 |
| 3 | Marie | 1 | 2012-10-01 |
| 4 | Frank | 2 | 2010-12-11 |
| 5 | Debra | 1 | 2008-02-02 |
+---------------------------------------------+
So, right now I have the following:
SELECT Locations.Name, Images.Date, Images.User
FROM Locations INNER JOIN Images ON Locations.ID = Images.Location
WHERE Locations.ID IN ('1','3')
ORDER BY Images.Date DESC
Which returns:
+---------------------------------------------+
| Name | Date | User |
|-------------|-------------|-----------------|
| Indiana | 2012-10-01 | Marie |
| Indiana | 2012-06-22 | Ray |
| Illinois | 2011-09-18 | Robert |
| Indiana | 2008-02-02 | Debra |
+---------------------------------------------+
My question is, how can I get it so that the result returns only the first record with a distinct Location.Name value? So the final, correct result table would look like:
+---------------------------------------------+
| Name | Date | User |
|-------------|-------------|-----------------|
| Indiana | 2012-10-01 | Marie |
| Illinois | 2011-09-18 | Robert |
+---------------------------------------------+
Thanks a lot!
Simply use GROUP BY:
Select tempTable.Name, tempTable.Date, tempTable.User from
(
SELECT Locations.Name, Images.Date, Images.User, Locations.ID as locationID
FROM Locations
INNER JOIN Images ON Locations.ID = Images.Location
WHERE Locations.ID IN ('1','3')
ORDER BY Images.Date DESC
) as tempTable GROUP BY tempTable.locationID
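A caveat on the query above: it depends on GROUP BY returning the first row of each group from the sorted subquery, which MySQL does not guarantee, and as far as I know servers running with ONLY_FULL_GROUP_BY enabled reject it. A more portable alternative, shown here only as a sketch and not as part of the original answer, joins each image to the latest date for its location. It runs on Python's sqlite3 module as a stand-in for MySQL; the alias names latest and LatestDate are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Locations (ID INTEGER, Name TEXT)")
conn.execute("CREATE TABLE Images (ID INTEGER, User TEXT, Location INTEGER, Date TEXT)")
conn.executemany(
    "INSERT INTO Locations VALUES (?, ?)",
    [(1, "Indiana"), (2, "Ohio"), (3, "Illinois")],
)
conn.executemany(
    "INSERT INTO Images VALUES (?, ?, ?, ?)",
    [
        (1, "Ray", 1, "2012-06-22"),
        (2, "Robert", 3, "2011-09-18"),
        (3, "Marie", 1, "2012-10-01"),
        (4, "Frank", 2, "2010-12-11"),
        (5, "Debra", 1, "2008-02-02"),
    ],
)

query = """
    SELECT l.Name, i.Date, i.User
    FROM Locations AS l
    INNER JOIN Images AS i ON l.ID = i.Location
    INNER JOIN (
        SELECT Location, MAX(Date) AS LatestDate   -- newest image date per location
        FROM Images
        GROUP BY Location
    ) AS latest
       ON latest.Location = i.Location
      AND latest.LatestDate = i.Date
    WHERE l.ID IN (1, 3)
    ORDER BY i.Date DESC
"""
newest = conn.execute(query).fetchall()
for row in newest:
    print(row)
```

If two images at one location share the same latest date, both rows come back, so a tie-breaker such as the highest Images.ID would be needed in that case.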

MySQL - No result when joining tables using a NULL value

Similar question here but this is slightly different...
I have two tables that I want to join:
location
---------------------------
| id | city | state_id |
---------------------------
| 1 | Denver | 6 |
| 2 | Phoenix | 2 |
| 3 | Seattle | NULL |
---------------------------
state
-------------------
| id | name |
-------------------
| 1 | Alabama |
| 2 | Alaska |
| 3 | Arizona |
| 4 | Arkansas |
| 5 | California |
| 6 | Colorado |
-------------------
SELECT
location.id,
location.city,
state.name
FROM
location
JOIN
state ON state.id = location.state_id;
However, in the case where location.state_id happens to be NULL (perhaps the person inputting the data forgot to select a state), the query would not return a result, but that doesn't mean the location doesn't exist.
How do I get around this problem and somehow display all the locations, even though the state_id might be NULL ?
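Use a LEFT OUTER JOIN, which keeps every row from location and fills state.name with NULL when there is no matching state: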
SELECT
location.id,
location.city,
state.name
FROM
location
LEFT OUTER JOIN
state ON state.id = location.state_id;
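For comparison, a minimal sketch that runs both joins on the sample data from the question, using Python's sqlite3 module in place of MySQL (an assumption made so it runs without a server):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE location (id INTEGER, city TEXT, state_id INTEGER)")
conn.execute("CREATE TABLE state (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO location VALUES (?, ?, ?)",
    [(1, "Denver", 6), (2, "Phoenix", 2), (3, "Seattle", None)],
)
conn.executemany(
    "INSERT INTO state VALUES (?, ?)",
    [(1, "Alabama"), (2, "Alaska"), (3, "Arizona"),
     (4, "Arkansas"), (5, "California"), (6, "Colorado")],
)

# Plain JOIN: the Seattle row is dropped because NULL never equals any state.id
inner_rows = conn.execute("""
    SELECT location.id, location.city, state.name
    FROM location
    JOIN state ON state.id = location.state_id
    ORDER BY location.id
""").fetchall()

# LEFT OUTER JOIN: every location is kept, with NULL (None) for the missing state
left_rows = conn.execute("""
    SELECT location.id, location.city, state.name
    FROM location
    LEFT OUTER JOIN state ON state.id = location.state_id
    ORDER BY location.id
""").fetchall()

print(inner_rows)
print(left_rows)
```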