I'm importing CSV files with missing data into a MariaDB table. I need to find all codes that don't have a corresponding row with place = 2.
Table cityX
| id | code | place | value | description | subcode |
| 1 | 001x | 1 | 6.00 | unique str | A |
| 2 | 002x | 1 | 2.23 | diff string | B |
| 3 | 003x | 1 | 2.23 | another str | B |
Every code in the table must appear in two otherwise identical rows, one with place = 1 and one with place = 2:
| id | code | place | value | description | subcode |
| 1 | 001x | 1 | 6.00 | unique str | A |
| 2 | 001x | 2 | 6.00 | unique str | A |
I've used variations of SELECT ... EXCEPT statements to isolate the codes, with a varying number of erroneous fields.
I ran SELECT [code] FROM cityX WHERE place = '1' EXCEPT SELECT [code] FROM cityX WHERE place = '2', stored the result in a temporary table, and joined it back to the remaining place, value, description, and subcode fields to retrieve the missing codes. This retrieves most of the missing records, but it introduces duplicates as well.
How can I properly select and insert the rows that are missing a place = 2?
This solution avoids EXCEPT, which doesn't work in every RDBMS (I'm not sure about MySQL).
-- All distinct codes in the table
SELECT CODES.code,
       CODE_W_1.place AS place_1,
       CODE_W_2.place AS place_2
FROM (SELECT code
      FROM cityX
      GROUP BY code) AS CODES
-- Codes that have a place = 1 row
LEFT OUTER JOIN (SELECT code,
                        place
                 FROM cityX
                 WHERE place = 1
                 GROUP BY code) AS CODE_W_1
  ON CODES.code = CODE_W_1.code
-- Codes that have a place = 2 row
LEFT OUTER JOIN (SELECT code,
                        place
                 FROM cityX
                 WHERE place = 2
                 GROUP BY code) AS CODE_W_2
  ON CODES.code = CODE_W_2.code
-- Keep only the codes that are missing one of the two places
WHERE CODE_W_1.code IS NULL
   OR CODE_W_2.code IS NULL
I don't have access to mysql to test this, but I got this from Rasgo which automatically writes SQL.
We can use SELECT statements that test whether the other value exists. The queries can be run on their own to find unmatched values, or inside an INSERT to add the missing rows.
create table cityX (
id int primary key not null auto_increment,
code char(5),
place int );
insert into cityX (code, place) values
('001x',1),('001x',2),('002x',1),('003x',2);
select * from cityX
order by code, place;
id | code | place
-: | :--- | ----:
1 | 001x | 1
2 | 001x | 2
3 | 002x | 1
4 | 003x | 2
insert into cityX (code, place)
select x.code,1 from cityX x
where place = 2
and not exists
(select id from cityX c
where c.code = x.code
and place = 1);
insert into cityX (code, place)
select x.code,2 from cityX x
where place = 1
and not exists
(select id from cityX c
where c.code = x.code
and place = 2);
select * from cityX
order by code, place;
id | code | place
-: | :--- | ----:
1 | 001x | 1
2 | 001x | 2
3 | 002x | 1
6 | 002x | 2
5 | 003x | 1
4 | 003x | 2
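The demo above carries only code and place. A minimal sketch of the same NOT EXISTS insert extended to the value, description, and subcode columns from the question might look like the following. It uses Python's built-in sqlite3 module purely as a runnable stand-in for MariaDB (an assumption, not what the question uses), and the sample rows are taken from the question.

```python
import sqlite3

# SQLite is used here only as a stand-in for MariaDB so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cityX (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    code TEXT, place INTEGER, value REAL, description TEXT, subcode TEXT
);
INSERT INTO cityX (code, place, value, description, subcode) VALUES
    ('001x', 1, 6.00, 'unique str',  'A'),
    ('001x', 2, 6.00, 'unique str',  'A'),
    ('002x', 1, 2.23, 'diff string', 'B'),
    ('003x', 1, 2.23, 'another str', 'B');
""")

# Copy every place = 1 row that has no place = 2 partner, changing only place.
conn.execute("""
INSERT INTO cityX (code, place, value, description, subcode)
SELECT x.code, 2, x.value, x.description, x.subcode
FROM cityX AS x
WHERE x.place = 1
  AND NOT EXISTS (SELECT 1 FROM cityX AS c
                  WHERE c.code = x.code AND c.place = 2)
""")

rows = conn.execute(
    "SELECT code, place, value, description, subcode "
    "FROM cityX ORDER BY code, place"
).fetchall()
for row in rows:
    print(row)
```

Because the missing row is built directly from the existing place = 1 row, no temporary table or join back is needed. If the import can contain repeated place = 1 rows for one code, adding DISTINCT to the inner SELECT would be one way to avoid copying them more than once.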
I have a table like this
|num|id|name|prj|
| 1 | 1|abc | 1 |
| 2 | 1|efg | 1 |
| 3 | 1|cde | 1 |
| 4 | 2|zzz | 1 |
I want to run a query like this:
SELECT * FROM table WHERE prj=1 ORDER BY name
but printing repeated id values only once. I want to keep all the rows, and I would like to do this at the database level and not in the presentation layer (I know how to do it in PHP).
Desired result is
|num|id|name|prj|
| 1 | 1|abc | 1 |
| 3 | |cde | 1 |
| 2 | |efg | 1 |
| 4 | 2|zzz | 1 |
Any hint on where to start to build that query?
Use a session variable to test if the previous ID is the same as the current ID:
SELECT num, IF(@lastid = id, '', @lastid := id) AS id, name, prj
FROM table
CROSS JOIN (SELECT @lastid := null) x
ORDER BY table.id, name
Note that you need to qualify table.id, because ORDER BY defaults to using the alias from the SELECT list if it's the same as a table column, and that would order the empty fields first.
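As an alternative that does not depend on session variables, the id can be blanked whenever an earlier row (by name) exists for the same id. The sketch below swaps in a correlated EXISTS subquery for the variable technique, runs on Python's built-in sqlite3 as a stand-in for MySQL, and uses the hypothetical table name items because table is a reserved word. It assumes names are unique within an id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (num INTEGER, id INTEGER, name TEXT, prj INTEGER);
INSERT INTO items VALUES (1, 1, 'abc', 1), (2, 1, 'efg', 1),
                         (3, 1, 'cde', 1), (4, 2, 'zzz', 1);
""")

# Show the id only on the first row (by name) of each id; blank it elsewhere.
rows = conn.execute("""
SELECT t.num,
       CASE WHEN EXISTS (SELECT 1 FROM items AS p
                         WHERE p.prj = t.prj
                           AND p.id = t.id
                           AND p.name < t.name)
            THEN '' ELSE t.id END AS shown_id,
       t.name,
       t.prj
FROM items AS t
WHERE t.prj = 1
ORDER BY t.id, t.name
""").fetchall()

for row in rows:
    print(row)
```

Giving the computed column a different alias (shown_id) sidesteps the ORDER BY ambiguity described above, since the sort always refers to the real id column.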
I have two tables.
work:
+----+----------+
| id | position |
+----+----------+
| 1 | 1 |
| 2 | 2 |
+----+----------+
content:
+----+---------+------+-------------+
| id | work_id | name | translation |
+----+---------+------+-------------+
| 1 | 1 | Kot | 1 |
| 2 | 1 | Cat | 2 |
| 3 | 2 | Ptak | 1 |
| 4 | 2 | Bird | 2 |
| 5 | 2 | Ssss | 3 |
+----+---------+------+-------------+
I want to get a result like this:
+----+------+----------+
| id | name | sortName |
+----+------+----------+
| 1 | Kot | NULL |
| 1 | Cat | NULL |
| 2 | Ptak | Ssss |
| 2 | Bird | Ssss |
+----+------+----------+
Here is my query, which does not work:
select
w.id,
c.name,
cSort.name as sortName
from
work w
LEFT JOIN
content c
ON
(w.id=c.work_id)
LEFT JOIN
content cSort
ON
(w.id=cSort.work_id)
WHERE
c.translation IN(1,2) AND
cSort.translation=3
ORDER BY
sortName
For each work I want to get at least one translation, and the second one if it exists (translation=1 always exists). For every row I also want a special column holding the translation used for sorting, but this translation does not always exist for a given work.id. In this example I want to sort the works by translation=3.
Sorry for my imperfect English. Any ideas?
Best regards
/*
create table work ( id int, position int);
insert into work values
( 1 , 1 ),
( 2 , 2 );
create table content(id int, work_id int, name varchar(4), translation int);
insert into content values
( 1 , 1 , 'Kot' , 1),
( 2 , 1 , 'Cat' , 2),
( 3 , 2 , 'Ptak' , 1),
( 4 , 2 , 'Bird' , 2),
( 5 , 2 , 'Ssss' , 3);
*/
select w.id, c.name,
       (select s.name from content s where s.work_id = w.id and s.translation = 3) as sortname
from work w
join content c on w.id = c.work_id
where c.translation <> 3;
Result:
+------+------+----------+
| id | name | sortname |
+------+------+----------+
| 1 | Kot | NULL |
| 1 | Cat | NULL |
| 2 | Ptak | Ssss |
| 2 | Bird | Ssss |
+------+------+----------+
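The scalar subquery above can also be written as a LEFT JOIN on the same work_id, which makes the sort column available to ORDER BY directly. This is a minimal sketch run on Python's built-in sqlite3 as a stand-in for MySQL; the alias s and the extra c.id tie-breaker in ORDER BY are illustrative additions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE work (id INTEGER, position INTEGER);
INSERT INTO work VALUES (1, 1), (2, 2);
CREATE TABLE content (id INTEGER, work_id INTEGER, name TEXT, translation INTEGER);
INSERT INTO content VALUES
    (1, 1, 'Kot', 1), (2, 1, 'Cat', 2),
    (3, 2, 'Ptak', 1), (4, 2, 'Bird', 2), (5, 2, 'Ssss', 3);
""")

# The translation = 3 filter sits in the ON clause, so works without such a
# row are kept and simply get NULL as sortname.
rows = conn.execute("""
SELECT w.id, c.name, s.name AS sortname
FROM work AS w
JOIN content AS c
  ON c.work_id = w.id AND c.translation IN (1, 2)
LEFT JOIN content AS s
  ON s.work_id = w.id AND s.translation = 3
ORDER BY s.name, c.id
""").fetchall()

for row in rows:
    print(row)
```

Putting the translation = 3 condition in the WHERE clause instead, as in the question's query, would discard every work that has no such row, because the NULLs produced by the LEFT JOIN fail the comparison.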
So translation is also a work_id and you consider translation = 3 a translation in your example and translation <> 3 an original. You want to join each original record with every translation record where the latter's work_id matches the former's translation.
I think you are simply confusing IDs here. It should be ON (c.translation = cSort.work_id).
Another way to write the query:
select o.work_id as id, o.name, t.name as sortname
from (select * from content where translation <> 3) o
left join (select * from content where translation = 3) t
on t.work_id = o.translation
order by t.name;
There seems to be no need to join table work.
I'd like to add that the table design is a bit confusing. Somehow it is not clear from it what is a translation for what. In your example you interpret translation 3 as a translation for the non-three records, but this is just an example as you say. I don't find this readable.
UPDATE: In order to sort your results by work.position, you can join that table or use a subquery instead. Here is the order by clause for the latter:
order by (select position from work w where w.id = o.work_id);
+------+---------+--------+---------+---------+---------+
| id | user_id | obj_id | created | applied | content |
+------+---------+--------+---------+---------+---------+
| 1 | 1 | 1 | 1 | 1 | ... |
| 2 | 1 | 2 | 1 | 1 | ... |
| 3 | 1 | 1 | 1 | 2 | ... |
| 4 | 1 | 2 | 2 | 2 | ... |
| 5 | 2 | 1 | 1 | 1 | ... |
| 6 | 2 | 2 | 1 | 1 | ... |
+------+---------+--------+---------+---------+---------+
I have a table similar to the one above. id, user_id and obj_id are foreign keys; created and applied are timestamps stored as integers. I need to get the entire row, grouped by user_id and obj_id, with the maximum value of applied. If two rows have the same applied value, I need to favour the maximum value of created. So for the above data, my desired output is:
+------+---------+--------+---------+---------+---------+
| id | user_id | obj_id | created | applied | content |
+------+---------+--------+---------+---------+---------+
| 3 | 1 | 1 | 1 | 2 | ... |
| 4 | 1 | 2 | 2 | 2 | ... |
| 5 | 2 | 1 | 1 | 1 | ... |
| 6 | 2 | 2 | 1 | 1 | ... |
+------+---------+--------+---------+---------+---------+
My current solution is to get everything ordered by applied then created:
select * from data order by applied desc, created desc;
and sort things out in the code, but this table gets pretty big and I'd like an SQL solution that just gets the data I need.
select *
from my_table
where id in (
    /* inner subquery B: one id per (user_id, obj_id) group */
    select max(id)
    from my_table
    where (user_id, obj_id, applied, created) in (
        /* inner subquery A */
        select user_id, obj_id, max(applied), max(created)
        from my_table
        group by user_id, obj_id
    )
    group by user_id, obj_id
);
Inner subquery A returns one row per (user_id, obj_id) containing max(applied) and max(created). Subquery B uses these tuples in an IN clause and returns a single id for each group whose row matches them, so you end up with a collection of valid ids for your result.
The main select then uses these ids to fetch the rows you need. Note that max(applied) and max(created) are computed independently, so a group whose most recently applied row does not also carry the largest created value will match no row.
Thanks to Mark Heintz in the comments, this answer got me to where I need to be.
SELECT
data.id,
data.user_id,
data.obj_id,
data.created,
data.applied,
data.content
FROM data
LEFT JOIN data next_max_applied ON
next_max_applied.user_id = data.user_id AND
next_max_applied.obj_id = data.obj_id AND (
next_max_applied.applied > data.applied OR (
next_max_applied.applied = data.applied AND
next_max_applied.created > data.created
)
)
WHERE next_max_applied.applied IS NULL
GROUP BY data.user_id, data.obj_id;
Go read the answer for details on how it works; the left join tries to find a more recently applied row for the same user and object. If there isn't one, it will find a row applied at the same time, but created more recently.
The above means that any row without a more recent row to replace it will have a next_max_applied.applied value of null. These rows are filtered for by the IS NULL clause.
Finally, the group by clause handles any rows that have identical user, object, applied and created columns.
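The anti-join described above can be tried on the sample rows from the question. This is a minimal sketch using Python's built-in sqlite3 as a stand-in for MySQL; the alias newer is illustrative, and the final GROUP BY is left out because the sample data contains no rows that tie on both applied and created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (id INTEGER, user_id INTEGER, obj_id INTEGER,
                   created INTEGER, applied INTEGER, content TEXT);
INSERT INTO data VALUES
    (1, 1, 1, 1, 1, '...'), (2, 1, 2, 1, 1, '...'),
    (3, 1, 1, 1, 2, '...'), (4, 1, 2, 2, 2, '...'),
    (5, 2, 1, 1, 1, '...'), (6, 2, 2, 1, 1, '...');
""")

# For each row, look for a "better" row in the same (user_id, obj_id) group:
# a later applied value, or the same applied value with a later created value.
# Rows for which no better row exists are the ones we want.
rows = conn.execute("""
SELECT d.id, d.user_id, d.obj_id, d.created, d.applied
FROM data AS d
LEFT JOIN data AS newer
  ON newer.user_id = d.user_id
 AND newer.obj_id = d.obj_id
 AND (newer.applied > d.applied
      OR (newer.applied = d.applied AND newer.created > d.created))
WHERE newer.id IS NULL
ORDER BY d.id
""").fetchall()

ids = [r[0] for r in rows]
print(ids)
```

With this data each (user_id, obj_id) pair yields exactly one row. If two rows in a group could share both applied and created, both would survive the filter, which is the case the GROUP BY in the answer is there to collapse.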
I have a table with the following structure:
| pid | email | email_type |
| 1 | x | 1 |
| 1 | y | 2 |
| 1 | z | 3 |
| 2 | ab | 1 |
| 3 | cd | 2 |
Now I want the result for a given pid in the following format, covering email_type 1 and 2 only:
Case pid=1
| pid | email_p | email_w |
| 1 | x | y |
Case pid=2
| 2 | ab | NULL |
Case pid=3
| 3 | NULL | cd |
Here email_p represents email_type=1 and email_w represents email_type=2.
I am using the following query, and it works fine except for the case pid=2; Case I and Case III are fetched successfully. Please provide a solution, with a good explanation if possible.
SELECT `e`.`pid`, `e`.`email` AS `email_p`, `e1`.`email` AS `email_w` FROM `table1` AS `e` LEFT JOIN `table1` AS `e1` ON e.pid=e1.pid AND e1.email_type=2 WHERE (e.pid IN (1) AND e.email_type=1)
It fails when e.pid = 2, returning an empty result set. Please provide a solution that also produces the required format for e.pid IN (1,2,3).
Try this:
SELECT pid,
MAX(CASE WHEN email_type=1 THEN email END ) as email_p ,
MAX(CASE WHEN email_type=2 THEN email END ) as email_w
FROM tableName
WHERE email_type IN (1,2)
GROUP BY pid
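To see why this covers all three cases, here is a minimal sketch that runs the same conditional aggregation on the sample rows, using Python's built-in sqlite3 as a stand-in for MySQL and the table name table1 from the question. Each CASE expression yields the email for its type and NULL otherwise, and MAX ignores NULLs, so a pid that lacks one of the two types still produces a row with NULL in that column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (pid INTEGER, email TEXT, email_type INTEGER);
INSERT INTO table1 VALUES
    (1, 'x', 1), (1, 'y', 2), (1, 'z', 3),
    (2, 'ab', 1),
    (3, 'cd', 2);
""")

# One output row per pid; each MAX(CASE ...) picks out the email of one type.
rows = conn.execute("""
SELECT pid,
       MAX(CASE WHEN email_type = 1 THEN email END) AS email_p,
       MAX(CASE WHEN email_type = 2 THEN email END) AS email_w
FROM table1
WHERE email_type IN (1, 2)
  AND pid IN (1, 2, 3)
GROUP BY pid
ORDER BY pid
""").fetchall()

for row in rows:
    print(row)
```

Unlike a self-join that starts from the email_type = 1 row, the grouping starts from every row of a pid, so neither type has to be present for the pid to appear.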
I have the following table:
---------------------------
| id | capital_id | value |
---------------------------
| 1 | 1 | a |
---------------------------
| 2 | 2 | b |
---------------------------
| 3 | 2 | c |
---------------------------
| 4 | 2 | d |
---------------------------
| 5 | 3 | b |
---------------------------
| 6 | 3 | e |
---------------------------
| 7 | 4 | f |
---------------------------
I need to select only distinct capital_id's, but different from one that has a value given.
To be more clear, I'll provide an example: If I have the record with id=5, I need to fetch all distinct capital_id's, different than 3 and with the value different from 'b' (so capital_id's to be fetched are: 1 and 4).
I managed to write the query like SELECT id FROM table WHERE capital_id != $capital_id AND value != $value, but duplicate capital_id's are fetched this way. I tried to add a GROUP BY capital_id, but then capital_id=2 is also fetched, although one of its values is 'b'.
How can I solve this problem?
SELECT capital_id
FROM tableName
WHERE capital_id <> $capital_id
GROUP BY 1
HAVING SUM(value = $value) = 0
Try this.
SELECT DISTINCT id FROM table WHERE capital_id != $capital_id AND value != $value
SELECT capital_id
FROM tableName t
WHERE capital_id != $capital_id
AND NOT EXISTS(SELECT *
FROM tableName
WHERE capital_id = t.capital_id
AND value = $value)
GROUP BY capital_id
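The HAVING and NOT EXISTS answers can be compared on the sample table. The sketch below uses Python's built-in sqlite3 as a stand-in for MySQL, binds the question's $capital_id and $value as query parameters (3 and 'b', matching the id=5 example), and keeps the table name tableName from the answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tableName (id INTEGER, capital_id INTEGER, value TEXT);
INSERT INTO tableName VALUES
    (1, 1, 'a'), (2, 2, 'b'), (3, 2, 'c'), (4, 2, 'd'),
    (5, 3, 'b'), (6, 3, 'e'), (7, 4, 'f');
""")

capital_id, value = 3, "b"  # taken from the record with id = 5

# Variant 1: (value = ?) evaluates to 1 or 0 per row, so the SUM counts how
# many rows in the group carry the unwanted value; keep groups with none.
having_ids = [r[0] for r in conn.execute("""
    SELECT capital_id
    FROM tableName
    WHERE capital_id <> ?
    GROUP BY capital_id
    HAVING SUM(value = ?) = 0
    ORDER BY capital_id
""", (capital_id, value))]

# Variant 2: drop every capital_id for which any row has the unwanted value.
not_exists_ids = [r[0] for r in conn.execute("""
    SELECT t.capital_id
    FROM tableName AS t
    WHERE t.capital_id <> ?
      AND NOT EXISTS (SELECT 1 FROM tableName AS u
                      WHERE u.capital_id = t.capital_id AND u.value = ?)
    GROUP BY t.capital_id
    ORDER BY t.capital_id
""", (capital_id, value))]

print(having_ids, not_exists_ids)
```

Both variants test the whole capital_id group rather than individual rows, which is why capital_id 2 is excluded even though two of its rows have a value other than 'b'.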