AWS GLUE SQL join with single row from right table - mysql

I'm trying to join two datasets in AWS Glue.
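Table 1 (alias af):
id | data     | created
----------------------------
1  | string 1 | 2020-02-10
2  | string 2 | 2020-02-11
3  | string 3 | 2020-02-12
Table 2 (alias mp):
id | data     | data2       | created    | foreign_key
-------------------------------------------------------
1  | string 1 | json string | 2020-02-10 | 2
2  | string 2 | json string | 2020-02-11 | 3
3  | string 3 | json string | 2020-02-12 | 3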
What I want to do is get all rows from table 1 and, for each one, select the first row from table 2 that matches the foreign key.
This is what I have so far. After going through a few questions, I found that I need to wrap the subquery in an aggregate function to let Spark know that only one element will match it.
select af.id,af.data
(select first(mp.data)
from mp
where af.id= mp.foreign_key
) as alias1,
(select first(mp.data2)
from mp
where af.id= mp.foreign_key
) as alias2
from af
having alias 1 is not null and alias2 is not null
But this is giving me the following error:
ParseException: mismatched input 'first' expecting {')', ',', '-'}(line 3, pos 15)
Any help will be appreciated!

I've found a solution that works for my use case. The comment above was right: the SQL was funky before.
Select af.*, mp.*
from af join
(select mp.*, row_number() over (partition by mp.fid order by mp.created_at) as seqnum
from mp
) mp
on af.id= mp.fid and seqnum = 1;
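For anyone who wants to try the row_number() pattern outside Glue, here is a minimal sketch that runs it against the sample tables from the question. It uses Python's built-in sqlite3 module as a stand-in for Spark SQL, which is an assumption, as is mapping fid and created_at to the question's foreign_key and created columns. It needs an SQLite build with window functions (3.25 or later).

```python
import sqlite3


def first_match_per_key():
    """Join af to the earliest-created mp row for each foreign key."""
    conn = sqlite3.connect(":memory:")
    # Sample data copied from the question; column names follow its tables.
    conn.executescript("""
        CREATE TABLE af (id INTEGER, data TEXT, created TEXT);
        CREATE TABLE mp (id INTEGER, data TEXT, data2 TEXT,
                         created TEXT, foreign_key INTEGER);
        INSERT INTO af VALUES
            (1, 'string 1', '2020-02-10'),
            (2, 'string 2', '2020-02-11'),
            (3, 'string 3', '2020-02-12');
        INSERT INTO mp VALUES
            (1, 'string 1', 'json string', '2020-02-10', 2),
            (2, 'string 2', 'json string', '2020-02-11', 3),
            (3, 'string 3', 'json string', '2020-02-12', 3);
    """)
    rows = conn.execute("""
        SELECT af.id, af.data, m.data, m.data2
        FROM af
        JOIN (
            -- Number the mp rows inside each foreign_key group, oldest first.
            SELECT mp.*,
                   ROW_NUMBER() OVER (PARTITION BY mp.foreign_key
                                      ORDER BY mp.created) AS seqnum
            FROM mp
        ) AS m
          ON af.id = m.foreign_key AND m.seqnum = 1
        ORDER BY af.id
    """).fetchall()
    conn.close()
    return rows


print(first_match_per_key())
```

Rows of af without a match (id 1 here) are dropped by the inner join; switching to LEFT JOIN should keep them, with NULLs on the mp side.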

Related

How to get distinct id based on the order of increasing times of ID in MYSQL

How do I get the distinct ids from a group of ids, ordered by the number of times each one occurs?
For example, input: 3,1,1,2,2,2
Here id 2 is present 3 times, id 1 is present 2 times, and id 3 is present 1 time.
The expected output is 2,1,3
How can I get this with a single query in MySQL?
select id, COUNT(id) from your_table
group by id
order by COUNT(id) desc
Here's a simple query that also returns the count, so you can check that the ids come out in the correct order.
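A quick way to check the ordering is to run the grouping query on the example input. The sketch below does that with Python's sqlite3 module instead of MySQL (an assumption made only so it runs anywhere). The table name your_table comes from the query above, and the extra tie-break on id is my own addition.

```python
import sqlite3


def ids_by_frequency(ids):
    """Return (id, count) pairs, most frequent id first."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE your_table (id INTEGER)")
    conn.executemany("INSERT INTO your_table VALUES (?)", [(i,) for i in ids])
    rows = conn.execute("""
        SELECT id, COUNT(id) AS cnt
        FROM your_table
        GROUP BY id
        ORDER BY cnt DESC, id  -- highest count first; ties broken by id
    """).fetchall()
    conn.close()
    return rows


print(ids_by_frequency([3, 1, 1, 2, 2, 2]))
```

For the input 3,1,1,2,2,2 the ids come back in the order 2, 1, 3.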
First, we need to consider where this input comes from:
3,1,1,2,2,2
How the CSV input can be pre-filtered depends on whether it comes from:
User input
Query output
If it is user input, MySQL cannot access the value directly unless it is stored as data. In that case, some PHP or other program will be sending the data to MySQL. Assuming PHP, what I would do is:
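<?php
$csv = "3,1,1,2,2,2";
$arr = explode(",", $csv);
// Count how often each value occurs, then sort by that count, highest first.
$counts = array_count_values($arr);
arsort($counts);
$arr = array_keys($counts);
?>
Now you will have the unique values, ordered from most to least frequent.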
If it was a query output, you just need to use DISTINCT keyword.
SELECT DISTINCT(`id`) FROM `table` WHERE `SomeCondition`='Value';
You can also try by using GROUP BY, but using DISTINCT is much faster IMHO. (What's faster, SELECT DISTINCT or GROUP BY in MySQL?)
Suppose we have 2 tables with us:
1) student: Fields are as follows:
a) id: INTEGER AUTO INCREMENT PRIMARY KEY
b) name: VARCHAR
Sample Data:
student
id | name
----------
1 | A
2 | B
3 | C
2) marks: Fields are as follows:
a) id: INTEGER AUTO INCREMENT PRIMARY KEY
b) sid: INTEGER FOREIGN KEY (refers to id field from student table)
c) subject: VARCHAR
d) marks: INTEGER
Sample Data:
marks:
id | sid | subject | marks
--------------------------
1 | 1 | s1 | 40
2 | 2 | s2 | 50
3 | 2 | s1 | 60
4 | 2 | s2 | 70
5 | 3 | s1 | 80
Use the query below to get the distinct student ids, ordered by their number of referring records in descending order:
SELECT `student`.`id`, COUNT(*) AS `total`
FROM `student`
INNER JOIN `marks` ON (`student`.`id` = `marks`.`sid`)
GROUP BY `student`.`id`
ORDER BY `total` DESC
You can use group by to get unique ids.
SQL Query:
select id from your_table group by id;

How to get unique latest updated database?

I am trying to retrieve data by unique id, but only the latest row for each one, i.e.:
id customer_name uniqueId
1 x 1123
2 y 1123
3 z 1124
4 m 1125
5 n 1125
expected output after query:
id customer_name uniqueId
1 y 1123
2 z 1124
3 n 1125
I used the following statements but couldn't get the expected answer:
$customers = DB::table('customers')
->select('uniqueId','customer_name','id','created_at')
->orderBy('created_at','desc')
->groupBy('uniqueId')
->get();
Can anyone suggest me the right answer?
I think it should be like:
SELECT [id],[customer_name],[uniqueId]
FROM [Table name]
WHERE [id]= 2 OR ID = 3 OR ID = 5
Or with the IN clause
where id in (id1,id2.........long list)
If you really want to change the ids to (1, 2, 3), you can always export this data somewhere else, e.g. into a new table, and run an update, e.g.:
UPDATE Your Table Here
SET id = 1
WHERE [unique_id]=1123
etc.
Cheers,
Kacper
I solved your problem. It takes a somewhat complex query, but it works. It is a raw query, which I run through the Laravel query builder. If you find a simpler query, let us know.
$sql = "
SELECT p.`id`,p.`customer_name`,p.`uniqueId`
FROM
(
SELECT `uniqueId`, MAX(created_at) AS created_at
FROM `customers`
GROUP BY `uniqueId`
) t
JOIN `customers` p ON t.`uniqueId` = p.`uniqueId` AND t.created_at = p.`created_at`
";
$customers = DB::select($sql);
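To see the "latest row per uniqueId" idea in isolation, here is a small sketch using Python's sqlite3 module rather than Laravel/MySQL (an assumption for the sake of a runnable example). The created_at values are made up, since the question's sample data doesn't show them. The derived table matches on both uniqueId and created_at, so a timestamp shared by two different uniqueIds can't pull in the wrong row.

```python
import sqlite3


def latest_customer_per_unique_id():
    """Return the most recently created row for each uniqueId."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE customers (
            id INTEGER, customer_name TEXT, uniqueId INTEGER, created_at TEXT
        )
    """)
    # Names and ids are from the question; the timestamps are hypothetical.
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?, ?)",
        [
            (1, "x", 1123, "2020-01-01 10:00:00"),
            (2, "y", 1123, "2020-01-02 10:00:00"),
            (3, "z", 1124, "2020-01-03 10:00:00"),
            (4, "m", 1125, "2020-01-04 10:00:00"),
            (5, "n", 1125, "2020-01-05 10:00:00"),
        ],
    )
    rows = conn.execute("""
        SELECT c.id, c.customer_name, c.uniqueId
        FROM customers AS c
        JOIN (
            -- Newest timestamp within each uniqueId group.
            SELECT uniqueId, MAX(created_at) AS created_at
            FROM customers
            GROUP BY uniqueId
        ) AS latest
          ON latest.uniqueId = c.uniqueId
         AND latest.created_at = c.created_at
        ORDER BY c.uniqueId
    """).fetchall()
    conn.close()
    return rows


print(latest_customer_per_unique_id())
```

With these timestamps the rows for y, z and n come back, which matches the customer names in the expected output (the original ids are kept rather than renumbered).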

MySQL select query with AND condition on same columns of same table

I have a table like this
itemid | propertyname | propertyvalue
___________|______________|_______________
1 | point | 12
1 | age | 10
2 | point | 15
2 | age | 11
3 | point | 9
3 | age | 10
4 | point | 13
4 | age | 11
I need a query to select all items where age greater than 10 and point less than 12.
I tried
select itemid from table where (propertyname="point" and propertyvalue < 12)
and (propertyname="age" and propertyvalue >10)
It gives no results. How can I make it work?
You can use an inner join:
SELECT
a.itemid
FROM
yourTable a
INNER JOIN
yourTable b
ON
a.itemid=b.itemid
AND a.propertyname='point'
AND b.propertyname='age'
WHERE
a.propertyvalue<12
AND b.propertyvalue>10
OK, so in table a you're looking for all items with the name point and a value smaller than 12, and in table b you're looking for all items with the name age and a value greater than 10. Then you only need the items that appear in both. For this you connect the two tables over the itemid, and to connect tables you use a join. Hope this helps you understand. If not, ask again :)
To join a table to itself in the same query you can include the table twice in the FROM clause, giving it a different alias each time. Then you simply proceed with building your query as if you were dealing with two separate tables that just happen to contain exactly the same data.
In the query below the table example is aliased as a and b:
SELECT a.itemid
FROM example a, example b
WHERE a.itemid = b.itemid
AND a.propertyname = 'point'
AND a.propertyvalue < 12
AND b.propertyname = 'age'
AND b.propertyvalue > 10
Try It:
SELECT itemid FROM test_table WHERE propertyname="point" AND propertyvalue < 12 AND itemid IN(SELECT itemid FROM test_table WHERE propertyname="age" AND propertyvalue >10)
http://sqlfiddle.com/#!9/4eafc6/1
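Please try this. The OR on its own keeps rows that satisfy either condition, so group by itemid and keep only the items where both rows matched (this assumes one row per property per item):
select itemid from `table` where (propertyname="point" and propertyvalue < 12)
or (propertyname="age" and propertyvalue >10)
group by itemid having count(*) = 2;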
Here's one idea...
SELECT itemid
, MAX(CASE WHEN propertyname = 'point' THEN propertyvalue END) point
, MAX(CASE WHEN propertyname = 'age' THEN propertyvalue END) age
FROM a_table
GROUP
BY itemid
HAVING age+0 > 10
AND point+0 < 12;
You can use an inner join. In effect, you work with two copies of the table: from the first you select the rows where name="age" and val>10, and from the second the rows where name="point" and val<12.
It's like creating a second instance of your table that doesn't really exist; it just helps you extract the data you need in one query.
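Here is a rough, runnable comparison of two of the approaches from this thread: the self join and the MAX(CASE ...) pivot. It uses Python's sqlite3 module as a stand-in for MySQL, and the table name items is a placeholder of mine. Note that none of the four sample items actually has age > 10 and point < 12, so the usage lines also add a hypothetical item 5 (point 8, age 12) to show a match.

```python
import sqlite3

# Rows copied from the question's table: (itemid, propertyname, propertyvalue).
SAMPLE_ROWS = [
    (1, "point", 12), (1, "age", 10),
    (2, "point", 15), (2, "age", 11),
    (3, "point", 9), (3, "age", 10),
    (4, "point", 13), (4, "age", 11),
]


def matching_items(rows):
    """Return the matching itemids from the self join and from the pivot."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (itemid INTEGER, propertyname TEXT, propertyvalue INTEGER)"
    )
    conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)

    # Approach 1: join the table to itself, one alias per property.
    self_join = conn.execute("""
        SELECT a.itemid
        FROM items AS a
        JOIN items AS b ON a.itemid = b.itemid
        WHERE a.propertyname = 'point' AND a.propertyvalue < 12
          AND b.propertyname = 'age'   AND b.propertyvalue > 10
        ORDER BY a.itemid
    """).fetchall()

    # Approach 2: collapse each item to one row, then filter with HAVING.
    pivot = conn.execute("""
        SELECT itemid
        FROM items
        GROUP BY itemid
        HAVING MAX(CASE WHEN propertyname = 'age' THEN propertyvalue END) > 10
           AND MAX(CASE WHEN propertyname = 'point' THEN propertyvalue END) < 12
        ORDER BY itemid
    """).fetchall()
    conn.close()
    return [r[0] for r in self_join], [r[0] for r in pivot]


print(matching_items(SAMPLE_ROWS))
# Item 5 is hypothetical, added only so that something matches.
print(matching_items(SAMPLE_ROWS + [(5, "point", 8), (5, "age", 12)]))
```

Both approaches should agree with each other; which one reads better is mostly a matter of taste and of how many properties you need to filter on.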

Select From table name obtained dynamically from the query

I have 3 Tables
campaign1 (TABLE)
id campaign_details
1 'some detail'
campaign2 (TABLE)
id campaign_details
1 'some other detail'
campaign_list (TABLE)
id campaign_table_name
1 'campaign1'
2 'campaign2'
The campaign_list table contains the names of the two tables described above. I want to select from campaign_list and get the record count of each table whose name that select returns.
For example, the select returns campaign1 (a table name). Then I run a select query on campaign1 to count its records.
What I'm doing right now is:
- Select from campaign_list
- Loop through all campaign_table_names and run a select query for each one individually
Is there a way to do this using a single query?
Something like this:
select campaign_name,(SELECT COUNT(*) FROM c.campaign_name) as campcount from campaign_list c
SQLFiddle : http://sqlfiddle.com/#!9/b766d/2
It's not possible to build the table name dynamically inside a single query, but it's possible to cheat, especially if there are only two linked tables.
I've listed two options.
Left outer join both tables:
select campaign_name,
coalesce(c1.campaign_details, c2.campaign_details)
from campaign_list c
left join campaign1 c1 using (id)
left join campaign2 c2 using (id);
Union all two different selects:
select campaign_name,
campaign_details
from campaign_list c
join campaign1 c1 using (id)
union all
select campaign_name,
campaign_details
from campaign_list c
join campaign2 c2 using (id);
sqlfiddle
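If you keep the current schema and stay with the two-step approach from the question (read the table names, then count each table), the loop itself can be small. The sketch below is only an illustration using Python's sqlite3 module, not MySQL. The function names are made up, and the whitelist check against sqlite_master is SQLite-specific (in MySQL, information_schema.tables would play that role). Table names cannot be bound as query parameters, which is why the name is validated before being put into the SQL text.

```python
import sqlite3


def build_demo_db():
    """Create the three tables from the question in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE campaign1 (id INTEGER, campaign_details TEXT);
        CREATE TABLE campaign2 (id INTEGER, campaign_details TEXT);
        CREATE TABLE campaign_list (id INTEGER, campaign_table_name TEXT);
        INSERT INTO campaign1 VALUES (1, 'some detail');
        INSERT INTO campaign2 VALUES (1, 'some other detail');
        INSERT INTO campaign_list VALUES (1, 'campaign1'), (2, 'campaign2');
    """)
    return conn


def count_rows_per_campaign(conn):
    """Map each table named in campaign_list to its row count."""
    # Only accept names that really are tables in this database.
    existing = {
        name
        for (name,) in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    }
    names = conn.execute(
        "SELECT campaign_table_name FROM campaign_list ORDER BY id"
    ).fetchall()
    counts = {}
    for (table_name,) in names:
        if table_name not in existing:
            raise ValueError(f"unknown table: {table_name!r}")
        # Safe to interpolate here because the name was checked above.
        (count,) = conn.execute(f'SELECT COUNT(*) FROM "{table_name}"').fetchone()
        counts[table_name] = count
    return counts


conn = build_demo_db()
print(count_rows_per_campaign(conn))
```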
Combine your campaign tables into one table and add a type column (item_type, int).
campaign_items table:
item_id item_details item_type
1 'some detail' 1
2 'some detail' 1
3 'some other detail' 2
4 'some other detail' 2
campaign_lists table
campaign_id campaign_name
1 'campaign1'
2 'campaign2'
Then you can use the following select statement:
SELECT campaign_name, (SELECT COUNT(*) FROM campaign_items WHERE item_type = campaign_id) as campaign_count
FROM campaign_lists
Oops, writing took me so long that you got this answered by Colin Raaijmakers already. Well, I'll post my answer anyway in spite of being more or less the same answer. Maybe my elaboration helps you see the problem.
Your problem stems from a bad database design. A database is made to order data and its relations. A CD database holds albums, songs, artists, etc. A business database may hold items, warehouses, sales and so on. Your database holds table names. [... time for thinking :-) ]
(When writing a DBMS you would want to store table names, column names, constraints etc., but I guess I am right supposing that you are not writing a new DBMS.)
So create tables that deal with your actual data. E.g.:
campain_type (id_campain_type, description, ...)
campain (id_campain, id_campain_type, campain_date, ...)
campain_type
id_campain_type description
1 Type A
2 Type B
3 Type C
campain
id_campain id_campain_type date
33 1 2015-06-03
85 2 2015-10-23
97 2 2015-12-01
query
select
ct.description,
(select count(*) from campain c where c.id_campain_type = ct.id_campain_type) as cnt
from campain_type ct;
result
description cnt
Type A 1
Type B 2
Type C 0
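As a sanity check on the redesigned schema, here is a sketch that loads the sample rows above and counts campaigns per type. It uses Python's sqlite3 module instead of MySQL (an assumption), and it swaps the correlated subquery for a LEFT JOIN with GROUP BY, which should give the same counts, including the 0 for Type C. The identifiers keep the spelling used in the answer.

```python
import sqlite3


def campaign_counts():
    """Count campaigns per type, keeping types that have no campaigns."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE campain_type (
            id_campain_type INTEGER PRIMARY KEY,
            description TEXT
        );
        CREATE TABLE campain (
            id_campain INTEGER PRIMARY KEY,
            id_campain_type INTEGER REFERENCES campain_type (id_campain_type),
            campain_date TEXT
        );
        INSERT INTO campain_type VALUES (1, 'Type A'), (2, 'Type B'), (3, 'Type C');
        INSERT INTO campain VALUES
            (33, 1, '2015-06-03'),
            (85, 2, '2015-10-23'),
            (97, 2, '2015-12-01');
    """)
    rows = conn.execute("""
        SELECT ct.description, COUNT(c.id_campain) AS cnt
        FROM campain_type AS ct
        -- LEFT JOIN keeps types without campaigns; COUNT skips their NULLs.
        LEFT JOIN campain AS c ON c.id_campain_type = ct.id_campain_type
        GROUP BY ct.id_campain_type, ct.description
        ORDER BY ct.id_campain_type
    """).fetchall()
    conn.close()
    return rows


print(campaign_counts())
```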

common values b/w fields

This table lists user and item ids:
user_id item_id
1 1
1 2
1 3
2 1
2 3
3 1
3 4
3 3
How can I run a query on this table to list all the items that are common between given users?
My guess is that this will need a self join, but I'm not sure.
I am trying this query, but it returns an error:
SELECT *
FROM recs 1
JOIN recs 2 ON 2.user_id='2' AND 2.item_id=1.item_id
WHERE 1.user_id='1'
Try using alias names that start with a letter:
SELECT *
FROM recs r1
JOIN recs r2 ON r2.user_id='2' AND r2.item_id=r1.item_id
WHERE r1.user_id='1'
This returns
user_id item_id
------- -------
1 1
1 3
for your data. Demo on sqlfiddle.
Note: I kept single quotes in the query, because I assume that both IDs in your table are of character type. If that is not the case, remove single quotes around user ID values '1' and '2'.
I want it for n number of users ... I want the query to return all item_ids that are common among the users
SELECT DISTINCT(r1.item_id)
FROM recs r1
WHERE EXISTS (
SELECT *
FROM recs r2
WHERE r2.item_id=r1.item_id
AND r1.user_id <> r2.user_id
)
Demo #2.
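Note that the EXISTS query returns items shared by at least two users. If "common among the users" means items that every user in a given list has, one common pattern is GROUP BY item_id with HAVING COUNT(DISTINCT user_id) equal to the number of users. The sketch below tries that on the sample data with Python's sqlite3 module (a stand-in for MySQL, and the function name is made up); it treats the ids as integers, which is also an assumption.

```python
import sqlite3

# (user_id, item_id) pairs copied from the question.
RECS = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 4), (3, 3)]


def common_items(user_ids):
    """Return the item_ids that every user in user_ids has."""
    unique_users = sorted(set(user_ids))
    if not unique_users:
        return []
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE recs (user_id INTEGER, item_id INTEGER)")
    conn.executemany("INSERT INTO recs VALUES (?, ?)", RECS)
    # One bound parameter per user id, plus one for the expected user count.
    placeholders = ", ".join("?" for _ in unique_users)
    rows = conn.execute(
        f"""
        SELECT item_id
        FROM recs
        WHERE user_id IN ({placeholders})
        GROUP BY item_id
        HAVING COUNT(DISTINCT user_id) = ?
        ORDER BY item_id
        """,
        [*unique_users, len(unique_users)],
    ).fetchall()
    conn.close()
    return [item_id for (item_id,) in rows]


print(common_items([1, 2]))
print(common_items([1, 2, 3]))
```

For the sample data, items 1 and 3 are the ones shared by users 1, 2 and 3.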