Assume I have the following tables:
tableA
a_name | age | country
Jordan | 5 | Germany
Jordan | 6 | Spain
Molly | 6 | Spain
Paris | 7 | France
John | 7 | Saudi Arabia
John | 5 | Saudi Arabia
John | 6 | Spain
tableB
id (auto increment primary key) | age | country | group_num (initially null)
1 | 5 | Germany |
2 | 6 | Spain |
3 | 7 | France |
4 | 7 | Spain |
5 | 8 | France |
6 | 9 | France |
7 | 2 | Mexico |
8 | 7 | Saudi Arabia |
9 | 5 | Saudi Arabia |
I want to be able to do some kind of select/update where I am able to get the following values for the "group_num" column:
tableB
id (auto increment primary key) | age | country | group_num
1 | 5 | Germany | 1
2 | 6 | Spain | 1
3 | 7 | France | 1
4 | 7 | Spain |
5 | 7 | France | 2
6 | 9 | France |
7 | 2 | Mexico |
8 | 7 | Saudi Arabia | 1
9 | 5 | Saudi Arabia | 1
group_num is assigned based on the criteria of:
1) Places person "a_name" went.
2) Whether other people visited that same country. (regardless of age).
The ids 1, 2, 3, 8 and 9 all have the same group_num because Jordan, Molly, and Paris are linked through the two criteria above (they all went to Spain), and so are the other countries they visited. For example, Germany was visited by Jordan, who also visited Spain, so it has the same group_num. Saudi Arabia was visited by John, who also visited Spain, so it has the same group_num.
Is there some SQL query or set of queries (which may or may not involve creating other "complementary" tables) that gets to the desired result shown above? It is okay if group_num first has to be filled with auto-incrementing values like the "id" and then updated later if necessary. It is also okay to have non-null values in the group_num fields currently shown as empty.
Cursors and row-by-row iteration are very slow. The following are the steps I currently perform with cursors to fill in those values; if I can get rid of the cursors it would be great:
For tableA, we see that Jordan visited Germany at age 5 (group_num in tableB for [5, Germany] updated to 1).
Jordan visits Spain at age 6 (group_num for [6, Spain] updated to 1 to show it is the same grouping, since the same person, Jordan, visited Spain).
Molly visits Spain at age 6 (group_num for [6, Spain] updated to 1 since, even though it is a different person, the same age/country pair was hit).
Paris visits France at age 7 (group_num for [7, France] in tableB updated to 2 since she is a different person who visited a completely different country, regardless of age).
John visits Saudi Arabia at age 7 (group_num for [7, Saudi Arabia] in tableB updated to 3 for the age+country pair).
John visits Saudi Arabia at age 5 (group_num for [5, Saudi Arabia] in tableB updated to 3 for the age+country pair since it is still John).
John visits Spain at age 6 (group_num for [6, Spain] is already 1 because Jordan visited there before, so the groupings are linked: group_num for all the places John visited, [6, Spain], [5, Saudi Arabia], and [7, Saudi Arabia], is updated to 1).
You will need an iterative approach that runs once for each new item added to Table1. If you execute the following statements for each such item, it will be fast and efficient:
Here is SQLFiddle for state of the db just before inserting the last record in Table1.
BTW: your example is not entirely consistent with your description. I assume you assigned France 7 to group 1 by mistake, since Paris has no relation to anyone in group 1.
Notice the selects that I'm executing:
The first one searches for the group num of the places I have previously visited (this is my disjoint group, e.g. group num 3).
The second one searches for a disjoint group that the inserted record may be related to, by looking up the group num for Spain and age 6.
After finding out that two disjoint sets become joined as a result of the newly inserted record, you can UPDATE every row carrying the second group number to the first one, like this:
UPDATE Table2 set group_num = 1 where group_num = 3
So I have not used any cursors, but this update runs once per insert into Table1.
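The merging of disjoint groups described above is essentially a union-find (disjoint-set) problem. The following is a minimal sketch in Python rather than SQL, under the assumption that a person and every (age, country) pair they visited belong to one group, as in the step-by-step walk-through in the question. The function names `find`, `union` and `assign_groups` are illustrative only and do not come from the original posts.

```python
def find(parent, node):
    """Return the representative of node's set, shortening the path on the way."""
    parent.setdefault(node, node)
    while parent[node] != node:
        parent[node] = parent[parent[node]]  # path halving
        node = parent[node]
    return node


def union(parent, a, b):
    """Merge the sets that contain a and b."""
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a != root_b:
        parent[root_b] = root_a


def assign_groups(table_a, table_b):
    """Map each tableB id to a group number, or None if nobody visited that pair."""
    parent = {}
    for name, age, country in table_a:
        # Assumption: a person and each (age, country) pair they visited share a group.
        union(parent, ("person", name), ("visit", age, country))

    numbers = {}  # set representative -> group_num, in order of first appearance
    for _, age, country in table_a:
        root = find(parent, ("visit", age, country))
        numbers.setdefault(root, len(numbers) + 1)

    result = {}
    for row_id, age, country in table_b:
        key = ("visit", age, country)
        result[row_id] = numbers[find(parent, key)] if key in parent else None
    return result


TABLE_A = [
    ("Jordan", 5, "Germany"), ("Jordan", 6, "Spain"), ("Molly", 6, "Spain"),
    ("Paris", 7, "France"), ("John", 7, "Saudi Arabia"),
    ("John", 5, "Saudi Arabia"), ("John", 6, "Spain"),
]
TABLE_B = [
    (1, 5, "Germany"), (2, 6, "Spain"), (3, 7, "France"), (4, 7, "Spain"),
    (5, 8, "France"), (6, 9, "France"), (7, 2, "Mexico"),
    (8, 7, "Saudi Arabia"), (9, 5, "Saudi Arabia"),
]

groups = assign_groups(TABLE_A, TABLE_B)
print(groups)
```

With the sample data, ids 1, 2, 8 and 9 end up in group 1 and id 3 ([7, France]) in group 2, which agrees with the walk-through rather than with the desired-output table. Rows that no one in tableA visited stay unassigned.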
@Damascusi, you could check whether triggers can work instead of cursors. Triggers would be faster than cursors, provided group_num can be updated on the fly as each row is inserted into tableA.
My MySQL knowledge is a bit shaky. I have a table with (among others) the following columns/values:
ID | importID | distID | email     | street       | city
-----------------------------------------------------------
25 | 5        | 2      | abc@d.com | Main Road    | London
26 | 5        | 2      | mno@e.com | Oak Alley    | York
27 | 5        | 2      | pqr@s.com | Tar Pits     | London
28 | 5        | 2      | xyz@a.com | Fleet Street | London
...
99 | -1       | 2      | abc@d.com | New Street   | Exeter
I do some checks when new rows are inserted: validate email addresses, find duplicates with a different dist(ributor)ID, etc.
One of the tasks is "update existing rows with data of the freshly imported row when column "email" is identical" (yes, there can be multiple rows with identical email addresses).
At the time this task is performed, the importID of the currently inserted rows is always -1. I tried aliasing with all kinds of variations of
UPDATE table orig table dup
SET orig.street = dup.street, orig.city = dup.city
WHERE orig.email = dup.email
or joining with numerous variations of
UPDATE table orig
JOIN
(SELECT email FROM table
WHERE importID != -1) dup
ON orig.email = dup.email
SET orig.street = dup.street, orig.city = dup.city
What is my mistake?
Don't do it that way.
Store the duplicate email information in a separate table. That table is to contain records that are awaiting analysis and confirmation; do not try to include them in the main table.
This extra table can have multiple rows with the same email, but the main table must not.
The person processing the pending changes would use a SELECT into both tables (either two Selects or a Union), make a decision, and poke a button. One button would say to toss the new info; one would say to replace; etc.
And, as Tim suggests, have a TIMESTAMP in each table. This will assist the user in making changes, especially if there are multiple pending changes.
Although it would be better to change the approach, you could code your requirement as follows:
UPDATE mytab INNER JOIN mytab AS newrec ON mytab.email=newrec.email AND newrec.importID=-1
SET mytab.street=newrec.street, mytab.city=newrec.city;
Example
Before
ID  | importID | distID | email     | street       | city
25  | 5        | 2      | abc@d.com | Main Road    | London
26  | 5        | 2      | mno@e.com | Oak Alley    | York
27  | 5        | 2      | pqr@s.com | Tar Pits     | London
28  | 5        | 2      | xyz@a.com | Fleet Street | London
99  | -1       | 2      | abc@d.com | New Street   | Exeter
100 | -1       | 2      | pqr@s.com | foo          | bar
After
ID  | importID | distID | email     | street       | city
25  | 5        | 2      | abc@d.com | New Street   | Exeter
26  | 5        | 2      | mno@e.com | Oak Alley    | York
27  | 5        | 2      | pqr@s.com | foo          | bar
28  | 5        | 2      | xyz@a.com | Fleet Street | London
99  | -1       | 2      | abc@d.com | New Street   | Exeter
100 | -1       | 2      | pqr@s.com | foo          | bar
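To check the Before/After example without a MySQL server, here is a small sketch using Python's built-in sqlite3 module. It is not the statement from the answer: the multi-table UPDATE ... JOIN form is MySQL syntax, so this stand-in uses correlated subqueries instead, which SQLite accepts but MySQL rejects when the subquery reads the table being updated. The table name `mytab` follows the answer; everything else is an assumption made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytab (ID INTEGER PRIMARY KEY, importID INTEGER, "
    "distID INTEGER, email TEXT, street TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO mytab VALUES (?, ?, ?, ?, ?, ?)",
    [
        (25, 5, 2, "abc@d.com", "Main Road", "London"),
        (26, 5, 2, "mno@e.com", "Oak Alley", "York"),
        (27, 5, 2, "pqr@s.com", "Tar Pits", "London"),
        (28, 5, 2, "xyz@a.com", "Fleet Street", "London"),
        (99, -1, 2, "abc@d.com", "New Street", "Exeter"),
        (100, -1, 2, "pqr@s.com", "foo", "bar"),
    ],
)

# Copy street and city from the freshly imported row (importID = -1)
# onto every older row that has the same email address.
conn.execute(
    """
    UPDATE mytab
    SET street = (SELECT n.street FROM mytab AS n
                  WHERE n.email = mytab.email AND n.importID = -1),
        city   = (SELECT n.city FROM mytab AS n
                  WHERE n.email = mytab.email AND n.importID = -1)
    WHERE importID <> -1
      AND EXISTS (SELECT 1 FROM mytab AS n
                  WHERE n.email = mytab.email AND n.importID = -1)
    """
)

rows = conn.execute("SELECT ID, street, city FROM mytab ORDER BY ID").fetchall()
for row in rows:
    print(row)
```

The EXISTS condition keeps rows without a matching import (IDs 26 and 28 here) untouched. Without it, their street and city would be set to NULL.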
I am new to MySQL and I am facing a wall. I looked for a solution but could not find anything that matches my case. I understand how joins work. My question is: is it possible to do a SELECT based on the result of a different SELECT?
For clarifying I have 2 tables and 1 view:
report_type table:
id | name | view_name
-------------------------------
1 | citizens | citizens_report
report_filter table:
id | name | filter_column
-------------------------------
1 | City | city
2 | Nationality | nationality
citizens_report view:
id | city | nationality
-------------------------------
1 | Boston | American
2 | London | British
3 | London | Spanish
4 | Paris | French
5 | Paris | French
6 | Boston | German
7 | New York | American
For reportId = 1, I need to query a dynamic view whose name comes from the report_type table; here it will be citizens_report. So the first step is to build a query from that result, and then I need to create a union of the filters based on the citizens_report view.
Expected result:
filter | option
----------------------------
city | Boston
city | London
city | Paris
city | New York
nationality | French
nationality | German
nationality | American
nationality | British
nationality | Spanish
It doesn't have to be in any specific order (ORDER BY ASC would be fine). In any language I can create a map where the key is the filter and each option is added to an array.
I can do each step with a separate query and call the next one from any programming language, but could it be done in one query?
Thanks!
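The step-by-step route mentioned above (look up the view name, read the filter columns, then union one SELECT per filter) might look like the sketch below. It is only an illustration under several assumptions: it runs against an in-memory SQLite database through Python's sqlite3 module instead of MySQL, `citizens` is a made-up base table behind the view, the helper name `filter_options` is hypothetical, and every row of report_filter is assumed to apply to the report because the tables shown carry no link between the two. Table and column names cannot be passed as bound parameters, so the names read from the database are validated and then spliced into the query text.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE report_type (id INTEGER PRIMARY KEY, name TEXT, view_name TEXT);
    CREATE TABLE report_filter (id INTEGER PRIMARY KEY, name TEXT, filter_column TEXT);
    -- Hypothetical base table; the question only shows the view.
    CREATE TABLE citizens (id INTEGER PRIMARY KEY, city TEXT, nationality TEXT);
    CREATE VIEW citizens_report AS SELECT id, city, nationality FROM citizens;
    INSERT INTO report_type VALUES (1, 'citizens', 'citizens_report');
    INSERT INTO report_filter VALUES (1, 'City', 'city'), (2, 'Nationality', 'nationality');
    INSERT INTO citizens VALUES
        (1, 'Boston', 'American'), (2, 'London', 'British'), (3, 'London', 'Spanish'),
        (4, 'Paris', 'French'), (5, 'Paris', 'French'), (6, 'Boston', 'German'),
        (7, 'New York', 'American');
    """
)

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def filter_options(conn, report_id):
    """Return distinct (filter, option) pairs for the view behind report_id."""
    (view_name,) = conn.execute(
        "SELECT view_name FROM report_type WHERE id = ?", (report_id,)
    ).fetchone()
    columns = [
        row[0]
        for row in conn.execute("SELECT filter_column FROM report_filter ORDER BY id")
    ]
    # Only plain identifiers are allowed into the generated SQL text.
    for identifier in [view_name, *columns]:
        if not IDENTIFIER.fullmatch(identifier):
            raise ValueError(f"unexpected identifier: {identifier!r}")
    # UNION (not UNION ALL) removes the duplicate options.
    query = " UNION ".join(
        f"SELECT ? AS filter_name, {column} AS option_value FROM {view_name}"
        for column in columns
    )
    return conn.execute(query, columns).fetchall()


options = filter_options(conn, 1)
for filter_name, option_value in options:
    print(filter_name, option_value)
```

This still issues three statements. Folding them into one would need dynamic SQL on the server side, because the view name is data in one table and an identifier in the final query.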
I have the following two tables
User
UserID UserName UserCountry
1 User1 India
2 User2 India
3 User3 India
4 User4 China
5 User5 China
6 User6 Brazil
7 User7 Brazil
8 User8 USA
9 User9 USA
10 User10 USA
Status
UserID UserStatus
1 Active
2 Active
3 Inactive
4 Inactive
5 Dormant
6 Dormant
7 Active
8 Inactive
9 Active
10 Active
I want to query these tables and calculate the percentage of active users arranged by country. The output in this example should be
Country Percentage
India 66.66%
China 0.00%
Brazil 50.00%
USA 66.66%
Using the queries below, I am able to extract the numerator and the denominator of the percentage formula separately, but I don't understand how to proceed from here. It would be best if someone could suggest how to extract the desired output in a single query.
@active_user = Status.where(UserStatus: "Active").pluck(:UserID)
@active_user_bycountry = User.group(:UserCountry).where(UserID: @active_user.to_a).count(:UserID)
@total_user_bycountry = User.group(:UserCountry).count(:UserID)
I have searched thoroughly on SO and Google but haven't found any fitting answers yet. A close answer was "Difficulty with ActiveRecord query that does percentage calculation", but even this is not working for me.
Try this:
User.joins(:status).select("users.country AS user_country", "statuses.status AS user_status", "COUNT(*) AS count", "ROUND((COUNT(*)*100.0/(SELECT COUNT(*) FROM statuses WHERE status = 'Active')), 2) AS percentage").where("user_status = 'Active'").group("user_country")
You can clone this project to test it yourself.
The steps:
1) You need to join the users and statuses tables
2) Next, you need to select columns - in this case, we select 2 columns: country (from users) and status (from statuses)
3) Just for display purposes, I added another column called count to tell us the total active users in each country
4) Next, we take the total active users in each country, multiply by 100.0 and divide by the total active users in the statuses table
5) Since you want 2 decimal places only, we need to use the ROUND function
6) Steps 4 and 5 are done in one long statement, and the resulting column is aliased as percentage
7) We also need to tell ActiveRecord that we only want active users - look at the where part
8) Lastly, we need to group by country
Hope this helps.
Update
Example of the result:
+----+--------------+-------------+-------+------------+
| id | user_country | user_status | count | percentage |
+----+--------------+-------------+-------+------------+
| | Brazil | Active | 7 | 20.59 |
| | China | Active | 9 | 26.47 |
| | India | Active | 11 | 32.35 |
| | USA | Active | 7 | 20.59 |
+----+--------------+-------------+-------+------------+
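For comparison, the output requested in the question is a per-country ratio: active users in a country divided by all users in that same country (2 of 3 for India). One possible way to get that in a single query is conditional aggregation, sketched here with Python's sqlite3 module on the question's sample data. The table names `users` and `statuses` follow the answer, the column names follow the question, and using SQLite in place of the asker's database is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (UserID INTEGER PRIMARY KEY, UserName TEXT, UserCountry TEXT);
    CREATE TABLE statuses (UserID INTEGER PRIMARY KEY, UserStatus TEXT);
    INSERT INTO users VALUES
        (1, 'User1', 'India'), (2, 'User2', 'India'), (3, 'User3', 'India'),
        (4, 'User4', 'China'), (5, 'User5', 'China'),
        (6, 'User6', 'Brazil'), (7, 'User7', 'Brazil'),
        (8, 'User8', 'USA'), (9, 'User9', 'USA'), (10, 'User10', 'USA');
    INSERT INTO statuses VALUES
        (1, 'Active'), (2, 'Active'), (3, 'Inactive'), (4, 'Inactive'),
        (5, 'Dormant'), (6, 'Dormant'), (7, 'Active'), (8, 'Inactive'),
        (9, 'Active'), (10, 'Active');
    """
)

# Count the active users and all users of each country in one pass.
rows = conn.execute(
    """
    SELECT u.UserCountry,
           ROUND(100.0 * SUM(CASE WHEN s.UserStatus = 'Active' THEN 1 ELSE 0 END)
                 / COUNT(*), 2) AS percentage
    FROM users AS u
    JOIN statuses AS s ON s.UserID = u.UserID
    GROUP BY u.UserCountry
    """
).fetchall()

percent_active = dict(rows)
print(percent_active)
```

Note that ROUND yields 66.67 for India and USA, where the question's table shows the truncated 66.66. Countries with no active users still appear (China at 0.0) because no WHERE clause filters them out before grouping.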
id | parent_id | name
-------------------------
1 | null | World
2 | 1 | Sri Lanka
3 | 1 | America
4 | 2 | South Province
5 | 2 | Western Province
6 | 4 | Galle
7 | 6 | Wakwella
8 | 3 | New York
I need a MySQL query or stored procedure that calls itself recursively and returns all descendants (child nodes down to the leaf nodes) of a selected "id".
For example:
When I select all children of id=2,
the result should be:
South Province
Western Province
Galle
Wakwella
When I select all children of id=3,
the result should be:
New York
A similar question was answered here:
https://dba.stackexchange.com/questions/7147/find-highest-level-of-a-hierarchical-field-with-vs-without-ctes/7161#7161
On MySQL versions before 8.0, which have no recursive common table expressions, you need to use stored procedures for this.
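Where recursive common table expressions are available (MySQL 8.0 and later, and also SQLite), the descendants can be collected in one query. The sketch below runs the idea through Python's sqlite3 module on the sample rows. The table name `nodes` and the helper `descendants` are made up for the illustration, since the question does not name its table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical table name; the question only shows the columns.
    CREATE TABLE nodes (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
    INSERT INTO nodes VALUES
        (1, NULL, 'World'), (2, 1, 'Sri Lanka'), (3, 1, 'America'),
        (4, 2, 'South Province'), (5, 2, 'Western Province'),
        (6, 4, 'Galle'), (7, 6, 'Wakwella'), (8, 3, 'New York');
    """
)

DESCENDANTS_SQL = """
WITH RECURSIVE descendants(id, name) AS (
    -- Anchor: the direct children of the selected node.
    SELECT id, name FROM nodes WHERE parent_id = ?
    UNION ALL
    -- Recursive step: children of every node found so far.
    SELECT n.id, n.name
    FROM nodes AS n
    JOIN descendants AS d ON n.parent_id = d.id
)
SELECT name FROM descendants ORDER BY id
"""


def descendants(conn, node_id):
    """Return the names of all nodes below node_id, ordered by id."""
    return [name for (name,) in conn.execute(DESCENDANTS_SQL, (node_id,))]


print(descendants(conn, 2))
print(descendants(conn, 3))
```

The selected node itself is not part of the result, matching the examples in the question.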
Assume I have the following tables:
tableA
a_name | age | country
Jordan | 5 | Germany
Molly | 6 | Spain
Paris | 7 | France
tableB
b_name | age | country
Kyle | 5 | Germany
Bob | 6 | Spain
Bob | 7 | Spain
Stephen | 7 | France
Kyle | 9 | France
Mario | 2 | Mexico
I want to make it such that I can produce a tableC that contains:
id (auto increment primary key) | age | country | country_marker
1 | 5 | Germany | 1
2 | 6 | Spain | 2
3 | 7 | France | 3
4 | 7 | Spain | 2
5 | 8 | France | 3
6 | 9 | France | 3
7 | 2 | Mexico | 4
For the new table:
It takes each unique "age, country" pair exactly once and puts it into tableC, with "country_marker" automatically assigned as an incrementing number that is unique per distinct "country".
Note there is no countries table: the "country" in tableC is just based on whatever countries are in tableA and tableB, and country_marker is just a system-generated identifier for the unique countries in the table. The ordering of "id" and "country_marker" in tableC does not matter as long as it meets the rule above.
I have painfully tried to produce this using MySQL cursors and want to know whether it is possible, and which SQL query or set of queries I could use, to make this faster than using cursors.
I'd approach this by first creating table C in something like phpmyadmin. Then I'd write a query to pull the information from tables A and B that I want to put into C. Then, within phpmyadmin, I'd export the results of the query to a .sql file on my local machine. I'd open the .sql file to make the necessary modifications to the INSERT statements so that I can import the data in the .sql file to table C.
This isn't exactly the most efficient way, but it seems like the least technical way that would definitely work.
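A set-based alternative that avoids both cursors and manual editing is sketched below with Python's sqlite3 module, under the assumption that a helper table is acceptable. The helper table `country_ids` is hypothetical: it hands out one auto-incremented marker per distinct country, and tableC is then filled from the union of the (age, country) pairs joined to those markers. SQLite stands in for MySQL here, so the DDL keywords differ slightly from what MySQL expects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tableA (a_name TEXT, age INTEGER, country TEXT);
    CREATE TABLE tableB (b_name TEXT, age INTEGER, country TEXT);
    INSERT INTO tableA VALUES
        ('Jordan', 5, 'Germany'), ('Molly', 6, 'Spain'), ('Paris', 7, 'France');
    INSERT INTO tableB VALUES
        ('Kyle', 5, 'Germany'), ('Bob', 6, 'Spain'), ('Bob', 7, 'Spain'),
        ('Stephen', 7, 'France'), ('Kyle', 9, 'France'), ('Mario', 2, 'Mexico');

    -- Hypothetical helper table: one generated marker per distinct country.
    CREATE TABLE country_ids (
        marker INTEGER PRIMARY KEY AUTOINCREMENT,
        country TEXT UNIQUE
    );
    INSERT INTO country_ids (country)
    SELECT country FROM tableA UNION SELECT country FROM tableB;

    CREATE TABLE tableC (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        age INTEGER,
        country TEXT,
        country_marker INTEGER
    );
    -- UNION (not UNION ALL) keeps each (age, country) pair only once.
    INSERT INTO tableC (age, country, country_marker)
    SELECT p.age, p.country, c.marker
    FROM (SELECT age, country FROM tableA
          UNION
          SELECT age, country FROM tableB) AS p
    JOIN country_ids AS c ON c.country = p.country;
    """
)

rows = conn.execute("SELECT age, country, country_marker FROM tableC").fetchall()
for row in rows:
    print(row)
```

Which number each country receives depends on the order in which the UNION returns the countries, which the question says does not matter.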