MySQL select re-ask+simplification - mysql

The original question is here: MySQL self-referencing ID and selects
I would like to pose the question again with everything specific to that case removed.
I have this example table:
id1 id2
1 5
5 1
2 3
3 2
What SQL command would return:
id1 id2
1 5
2 3
Essentially removing the "duplicate rows".

Q1 and Q2 are the aliases I've created for your table, so we can reference the ids as if they were in different tables.
DELETE Q1 FROM table Q1
JOIN table Q2
ON Q1.id1 = Q2.id2
AND Q2.id1 = Q1.id2
WHERE Q1.id1 > Q1.id2
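The DELETE above removes the mirrored rows permanently. If only a SELECT is wanted, as the question asks, a non-destructive variant could look like the sketch below. It uses Python's built-in sqlite3 module purely so the example is self-contained; the table name pairs is hypothetical, and the SQL itself should carry over to MySQL largely unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (id1 INTEGER, id2 INTEGER)")  # hypothetical table name
conn.executemany(
    "INSERT INTO pairs VALUES (?, ?)",
    [(1, 5), (5, 1), (2, 3), (3, 2)],
)

# Keep a row if it is the ascending half of a pair (id1 <= id2),
# or if no mirrored row (id2, id1) exists for it at all.
DEDUP_SQL = """
SELECT p.id1, p.id2
FROM pairs p
WHERE p.id1 <= p.id2
   OR NOT EXISTS (
        SELECT 1
        FROM pairs m
        WHERE m.id1 = p.id2 AND m.id2 = p.id1
   )
ORDER BY p.id1
"""

unique_pairs = conn.execute(DEDUP_SQL).fetchall()
print(unique_pairs)  # [(1, 5), (2, 3)]
```

Unlike the self-join, the NOT EXISTS branch also keeps rows that have no mirrored counterpart.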

Related

Group by / Summing values with the same column value

I was trying to solve a problem that looks like the code below, but after reading through the SQLAlchemy documentation I have not found a solution yet.
Objective:
Get the summed value of sales_in_usd for rows that share the same year in the year column.
What I have so far, after debugging and reading through Stack Overflow, the documentation, and Google, is the following query:
session.query(fact_corporate_sales, Company, Sales,
Time, Sector, func.sum(Sales.sales_in_usd).label('summary')).\
join(Sales).\
join(Time).\
join(Company).\
join(Segment).\
order_by(Time.year.desc()).\
filter(Company.company_name.like(filtered)).\
group_by(fact_corporate_sales.fact_cps_id, Company.company_name,fact_corporate_sales.cps_id).\
all()
The fact_cps_id is unique in the fact table, and the same table also stores the keys of the dimension tables.
I have a fact table which stores 4 foreign keys from 4 dimension tables.
fact_cps_id company_id sales_id time_id sector_id
1 4 2 1 2
2 4 1 1 3
3 4 3 2 1
4 4 2 2 4
5 4 4 3 2
6 4 99 1 1
dim_company
company_id company_name
1 Nike
2 Adidas
3 Puma
4 Reebok
dim_segment
segment_id segment_nom
1 basketball
2 running
3 soccer
4 watersports
dim_time
time_id quarter year
1 1 2013
2 2 2013
3 1 2014
4 3 2014
dim_sales
sales_id sales_in_euro
1 2000
2 3200
3 1400
4 1590
.. ..
99 1931
So, as the tables and query show, I was trying to sum all sales that fall in the same year (dim_time.year).
Looking at the fact table, time_id = 1 appears three times, so those values could be summed and displayed as one summary.
I know from standard SQL that this is possible with GROUP BY and the aggregate function SUM.
My result (the time_id annotations are only for illustration and were not part of the output):
13132.0 <- time_id = 1
21201.0 <- time_id = 2
23923.0 <- time_id = 1
31232.0 <- time_id = 99
32021.0 <- time_id = 2
32342.0 <- time_id = 1
131231.0 <- time_id = 4
I printed the actual query into the console and got this [had to remove .all(), because 'list' has no attribute called 'statement']:
SELECT fact_corporate_sales.cps_fact_id, fact_corporate_sales.cps_id,
fact_corporate_sales.company_id, fact_corporate_sales.time_id,
fact_corporate_sales.segment_id, sum(dim_corporate_sales.sales_in_usd) AS summary
FROM fact_corporate_sales
INNER JOIN dim_corporate_sales ON dim_corporate_sales.cps_id = fact_corporate_sales.cps_id
INNER JOIN dim_time ON dim_time.time_id = fact_corporate_sales.time_id
INNER JOIN dim_company ON dim_company.company_id = fact_corporate_sales.company_id
INNER JOIN dim_segment ON dim_segment.segment_id = fact_corporate_sales.segment_id
WHERE dim_company.company_name LIKE %s
GROUP BY fact_corporate_sales.cps_fact_id
ORDER BY dim_time.year DESC
If I group by, for example, dim_time.year only, I get the following response from MySQL:
Error Code: 1055. Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'db.fact_corporate_sales.fact_cps_id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
The only solution I found was to execute the following SQL:
engine.execute("SET sql_mode='';")
Because the failed query reported:
"this is incompatible with sql_mode=only_full_group_by"
I disabled the sql_mode and then got my result.
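Error 1055 arises because the SELECT list contains columns that are neither grouped nor aggregated. A possible alternative to changing sql_mode is to select only the grouped column and the aggregate. The sketch below illustrates this with Python's built-in sqlite3 module rather than SQLAlchemy and MySQL, so that it runs on its own. It assumes the sample table names shown above (dim_sales, dim_time, dim_company), assumes the amount column is called sales_in_usd as in the query, and leaves out the elided dim_sales rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_company (company_id INTEGER PRIMARY KEY, company_name TEXT);
CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, quarter INTEGER, year INTEGER);
CREATE TABLE dim_sales (sales_id INTEGER PRIMARY KEY, sales_in_usd INTEGER);
CREATE TABLE fact_corporate_sales (
    fact_cps_id INTEGER PRIMARY KEY,
    company_id INTEGER, sales_id INTEGER, time_id INTEGER, sector_id INTEGER
);
INSERT INTO dim_company VALUES (1,'Nike'),(2,'Adidas'),(3,'Puma'),(4,'Reebok');
INSERT INTO dim_time VALUES (1,1,2013),(2,2,2013),(3,1,2014),(4,3,2014);
INSERT INTO dim_sales VALUES (1,2000),(2,3200),(3,1400),(4,1590),(99,1931);
INSERT INTO fact_corporate_sales VALUES
    (1,4,2,1,2),(2,4,1,1,3),(3,4,3,2,1),(4,4,2,2,4),(5,4,4,3,2),(6,4,99,1,1);
""")

# Only the grouped column (year) and the aggregate appear in the SELECT list,
# so the statement is valid under ONLY_FULL_GROUP_BY-style rules.
YEARLY_SQL = """
SELECT t.year, SUM(s.sales_in_usd) AS summary
FROM fact_corporate_sales f
JOIN dim_sales s ON s.sales_id = f.sales_id
JOIN dim_time t ON t.time_id = f.time_id
JOIN dim_company c ON c.company_id = f.company_id
WHERE c.company_name LIKE ?
GROUP BY t.year
ORDER BY t.year DESC
"""

yearly = conn.execute(YEARLY_SQL, ("Reebok",)).fetchall()
print(yearly)  # [(2014, 1590), (2013, 11731)]
```

In SQLAlchemy terms this would presumably correspond to querying only Time.year and the func.sum(...) label and grouping by Time.year, though that translation is not verified here.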

update query to update max of id

Apologies if the question is ambiguous. The scenario is as follows:
question master table:
id No
1 1
2 1
3 2
4 3
5 3
6 3
question part table:
qid statement
1 abc
2 xyz
3 a1235
4 abcde
5 asdf
This data now needs to be imported into a new structure
where only one qid is present for each question No.
So in the example above, qids 1 and 2 should both become either 1 or 2, and so on.
I am trying to write an update query, but it doesn't do exactly what is needed.
The end result could look like this:
qid statement
1 abc
1 xyz
3 a1235
4 abcde
4 asdf
the query is as follows:
update questionpart qp
set Q_ID =
(
select max(nq.Q_ID) FROM newquestion nq
where nq.Q_No = (select nq2.Q_No FROM newquestion nq2 where nq2.Q_ID = qp.Q_ID )
)
any help greatly appreciated.
update questionpart qp
join newquestion nq on qp.Q_ID = nq.Q_ID
set qp.Q_ID = nq.Q_No;
I believe this is what you are trying to do: set the question part Q_ID to the Q_No value of the new (master) question. If this is not what you are looking for, please provide the table schemas.
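If the goal is instead to collapse every part onto one representative Q_ID per question number, as in the expected result shown in the question, a correlated subquery is one way to sketch it. The example below uses Python's built-in sqlite3 module so it is runnable on its own. It assumes the column names Q_ID and Q_No from the question's query, and it picks MIN rather than MAX because that reproduces the sample output (1, 1, 3, 4, 4).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE newquestion (Q_ID INTEGER PRIMARY KEY, Q_No INTEGER);
CREATE TABLE questionpart (Q_ID INTEGER, statement TEXT);
INSERT INTO newquestion VALUES (1,1),(2,1),(3,2),(4,3),(5,3),(6,3);
INSERT INTO questionpart VALUES
    (1,'abc'),(2,'xyz'),(3,'a1235'),(4,'abcde'),(5,'asdf');
""")

# For each part, look up the question number of its current Q_ID (cur),
# then take the smallest Q_ID that shares that number (rep).
conn.execute("""
UPDATE questionpart
SET Q_ID = (
    SELECT MIN(rep.Q_ID)
    FROM newquestion cur
    JOIN newquestion rep ON rep.Q_No = cur.Q_No
    WHERE cur.Q_ID = questionpart.Q_ID
)
""")

parts = conn.execute(
    "SELECT Q_ID, statement FROM questionpart ORDER BY rowid"
).fetchall()
print(parts)
# [(1, 'abc'), (1, 'xyz'), (3, 'a1235'), (4, 'abcde'), (4, 'asdf')]
```

Swapping MIN for MAX would keep the highest Q_ID per question number instead.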

Robust SQL query on demographics data set

I have a rather complex (at least to me) SQL Server query to write on a demographics data set. I need to figure out how many respondents in the system match a specific demographic.
I have 2 main tables. I will list the relevant columns. Assume there are unique IDs on each row.
Table Respondents:
[RespondentID] [SystemEntryDate]
Table RespondentProfiles:
[QuestionID] [AnswerID]
The respondent ID on Respondents links to RespondentProfiles. For each question answered, a row is created. The question id corresponds to a specific question (say gender, ethnicity, state, and car ownership) and the answer id means something different depending on the question. Like 1 is male and 2 is female, or 1 might be white, 2 hispanic, 3 pacific islander, and so on.
I also have a table called Conditions. The conditions table looks like this:
[ConditionSetID] [QuestionID] [AnswerID]
The condition set id links the conditions together into a collection of conditions. So I can pass a condition set id to the query, and it will return a count of how many respondents meet those criteria, as well as the min and max dates from that set.
My query will look something like this:
create procedure query
@ConditionSetID int
as
select count(distinct r.ID) as Respondents,
min(r.SystemEntryDate) as EarliestDate,
max(r.SystemEntryDate) as LatestDate
from Respondents r
join RespondentProfiles rp
on r.ID = rp.RespondentID
join Conditions c
on c.ConditionSetID = @ConditionSetID
and c.QuestionID = rp.QuestionID
where rp.QuestionID = c.QuestionID
and rp.Condition = c.AnswerID
As an example, I might have a respondent profiles table like this
[RespondentID] [QuestionID] [AnswerID]
10001 1 (gender) 1 (male)
10001 2 (ethnicity) 1 (white)
10001 3 (car) 23 (lexus)
10002 1 (gender) 2 (female)
10002 2 (ethnicity) 2 (black)
10002 3 (car) 24 (buick)
10003 1 (gender) 2 (female)
10003 2 (ethnicity) 1 (white)
10003 3 (car) 5 (honda)
10004 1 (gender) 1 (male)
10004 2 (ethnicity) 2 (black)
10004 3 (car) 24 (buick)
And if I pick a specific condition set, the rows I'd have might look like:
[QuestionID] [AnswerID]
1 (gender) 2 (female)
2 (ethnicity) 2 (black)
3 (car) 24 (buick)
This would be asking for all the black females who own a buick, which should give me a count of 1.
Or I could have:
[QuestionID] [AnswerID]
3 (car) 23 (lexus)
3 (car) 24 (buick)
This is asking for everyone who owns a buick or lexus, which would be 3 people.
And then as a final example:
[QuestionID] [AnswerID]
2 (ethnicity) 2 (black)
3 (car) 23 (lexus)
3 (car) 24 (buick)
This is asking for everyone who is black and owns a lexus or everyone who is black and owns a buick, which would be 2 people.
I know this isn't horribly complicated, but it is the most complex thing I've attempted yet, and any help would be greatly appreciated. I'm having a lot of trouble figuring out how to set up the where clause, and even general direction would be appreciated. There are also about 800,000 records in the respondentprofiles table, so it must be efficient.
The where clause I have set up isn't quite correct, because it will only get the records as if the different questions are being or'd together as opposed to and'ed. So it will return a row for that respondent even if only one answer matches, which is wrong. A particular respondent must meet all the conditions in the condition set to be selected.
Perhaps I need to select into a temp table one question at a time or something? Or use some sort of grouping? I am just really confused about where to go with this. I hope I have provided enough information to adequately demonstrate my dilemma.
The examples below show how to get the respondent IDs of respondents who answered:
To question A, Yes
To question B, No
To question C, Yes
Assuming you are actually using SQL server (you tagged both mysql and sql server in your question), you can use:
select id
from RespondentProfiles
where QuestionID = 'a'
and AnswerID = 'yes'
intersect
select id
from RespondentProfiles
where QuestionID = 'b'
and AnswerID = 'no'
intersect
select id
from RespondentProfiles
where QuestionID = 'c'
and AnswerID = 'yes'
Or if you are using MySQL you can use:
select x.id
from RespondentProfiles x
join (select id
from RespondentProfiles
where QuestionID = 'b'
and AnswerID = 'no') y
on x.id = y.id
join (select id
from RespondentProfiles
where QuestionID = 'c'
and AnswerID = 'yes') z
on y.id = z.id
where x.QuestionID = 'a'
and x.AnswerID = 'yes'
Just to add to my answer what I put in the comments: there is no need for your conditions table. You do not need such a table in order to query for respondents who answered 2+ questions a certain way. You can use inline views and/or subqueries to accomplish that (or, in the case of SQL Server, the INTERSECT set operator).
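If the condition sets do have to stay in a Conditions table, as the question describes, one possible approach is grouping: join the profiles to the conditions, group by respondent, and keep only respondents who matched as many distinct questions as the condition set contains. That ORs answers within one question and ANDs across questions. The sketch below uses Python's built-in sqlite3 module so it runs on its own; the function name count_matching and the condition set ids 1, 2 and 3 are hypothetical, and the date columns are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE RespondentProfiles (RespondentID INTEGER, QuestionID INTEGER, AnswerID INTEGER);
CREATE TABLE Conditions (ConditionSetID INTEGER, QuestionID INTEGER, AnswerID INTEGER);
INSERT INTO RespondentProfiles VALUES
    (10001,1,1),(10001,2,1),(10001,3,23),
    (10002,1,2),(10002,2,2),(10002,3,24),
    (10003,1,2),(10003,2,1),(10003,3,5),
    (10004,1,1),(10004,2,2),(10004,3,24);
-- Hypothetical set ids for the three examples in the question.
INSERT INTO Conditions VALUES
    (1,1,2),(1,2,2),(1,3,24),
    (2,3,23),(2,3,24),
    (3,2,2),(3,3,23),(3,3,24);
""")

COUNT_SQL = """
SELECT COUNT(*) FROM (
    SELECT rp.RespondentID
    FROM RespondentProfiles rp
    JOIN Conditions c
      ON c.QuestionID = rp.QuestionID
     AND c.AnswerID = rp.AnswerID
    WHERE c.ConditionSetID = ?
    GROUP BY rp.RespondentID
    -- A respondent qualifies only if every question in the set was matched.
    HAVING COUNT(DISTINCT c.QuestionID) = (
        SELECT COUNT(DISTINCT QuestionID)
        FROM Conditions
        WHERE ConditionSetID = ?
    )
) AS matched
"""

def count_matching(conn, condition_set_id):
    """Count respondents who satisfy every question in the condition set."""
    row = conn.execute(COUNT_SQL, (condition_set_id, condition_set_id)).fetchone()
    return row[0]

print([count_matching(conn, s) for s in (1, 2, 3)])  # [1, 3, 2]
```

The three counts line up with the three examples in the question: 1, 3 and 2 respondents.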

MYSQL - Add multiple users to a table column

Maybe it's because I don't understand how to search for the right verbiage, but I'm having difficulty understanding how to attach multiple users to a table with multiple columns.
Here is what I'm attempting to do:
table name: user
user_id user_name
1 abc
2 xyz
3 pqr
4 new
table2 name : brackets
id user_id bracket_name
1 4,2 bracket_1
2 4,3,1 bracket_2
3 2,1 bracket_3
4 3,4,2 bracket_4
-- OR --
table name: user
user_id user_name brackets_id
1 abc 2,3
2 xyz 1,3,4
3 pqr 2,4
4 new 1,2,4
table2 name : brackets
brackets_id user_id bracket_name
1 4,2 bracket_1
2 4,3,1 bracket_2
3 2,1 bracket_3
4 3,4,2 bracket_4
I'm using Node.js with Sequelize as my ORM and understand enough to read, write, delete and update these tables, but when it comes to organizing my data, I'm completely lost!
Is it possible to store an array in MySQL with the user IDs or the brackets that the user is allowed to access? Brackets are created by a user, who can then invite friends to join. Users can join multiple brackets, including brackets created by other users.
Any guidance would be very helpful!
I think a Junction Table would simplify this for you: http://en.wikipedia.org/wiki/Junction_table
It would look something like:
table name: user
user_id user_name
1 abc
2 xyz
3 pqr
4 new
table2 name : brackets
brackets_id bracket_name
1 bracket_1
2 bracket_2
3 bracket_3
4 bracket_4
table3 your junction table:
user_id brackets_id
1 2
1 3
2 1
2 3
2 4
etc.
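With a junction table, both directions of the relationship become plain joins. The sketch below shows this using Python's built-in sqlite3 module so that it is self-contained; it is not Sequelize code. The junction table name user_brackets and the two helper functions are hypothetical, and the membership rows are taken from the comma-separated lists in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (user_id INTEGER PRIMARY KEY, user_name TEXT);
CREATE TABLE brackets (brackets_id INTEGER PRIMARY KEY, bracket_name TEXT);
-- One row per (user, bracket) membership; the composite key blocks duplicates.
CREATE TABLE user_brackets (
    user_id INTEGER REFERENCES user(user_id),
    brackets_id INTEGER REFERENCES brackets(brackets_id),
    PRIMARY KEY (user_id, brackets_id)
);
INSERT INTO user VALUES (1,'abc'),(2,'xyz'),(3,'pqr'),(4,'new');
INSERT INTO brackets VALUES
    (1,'bracket_1'),(2,'bracket_2'),(3,'bracket_3'),(4,'bracket_4');
INSERT INTO user_brackets VALUES
    (1,2),(1,3),(2,1),(2,3),(2,4),(3,2),(3,4),(4,1),(4,2),(4,4);
""")

def users_in_bracket(conn, brackets_id):
    """Return the names of all users who belong to the given bracket."""
    rows = conn.execute(
        """SELECT u.user_name
           FROM user_brackets ub
           JOIN user u ON u.user_id = ub.user_id
           WHERE ub.brackets_id = ?
           ORDER BY u.user_id""",
        (brackets_id,),
    )
    return [name for (name,) in rows]

def brackets_for_user(conn, user_id):
    """Return the names of all brackets the given user belongs to."""
    rows = conn.execute(
        """SELECT b.bracket_name
           FROM user_brackets ub
           JOIN brackets b ON b.brackets_id = ub.brackets_id
           WHERE ub.user_id = ?
           ORDER BY b.brackets_id""",
        (user_id,),
    )
    return [name for (name,) in rows]

print(users_in_bracket(conn, 2))   # ['abc', 'pqr', 'new']
print(brackets_for_user(conn, 2))  # ['bracket_1', 'bracket_3', 'bracket_4']
```

Inviting a friend to a bracket then amounts to inserting one row into the junction table, with no string parsing of comma-separated ids.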

MS Access 2007 Rows to columns in recordset

I have a table that holds questionnaire data.
My original table contains 450 columns and 212 rows.
SlNo is the id of the person who answered the questionnaire.
SlNo Q1a Q1b Q2a Q2b Q2c Q2d Q2e Q2f .... Q37c <450 columns>
1 1
2 1 1
3 1
4 1 1
5 1
I have to analyze this data, e.g. the number of persons who are male (Q1a) and who own a boat (Q2b), i.e. (select * from Questionnaire where Q1a=1 and Q2b=1), and there are many more combinations.
I designed this in MS Access, and everything worked except for one major problem: the number of table columns is restricted to 255.
To be able to enter this into an Access table, I inserted it as 450 rows and 212 columns (so I am now able to enter it into the Access db). Now, when fetching the records, I want the record set to transpose the results into the form I originally wanted, so that I do not have to change my algorithm or logic. How can I achieve this with minimal changes? This is my first time working with an Access database.
You might be able to use a crosstab query to generate what you are expecting. You could also build a transpose function.
Either way, I think you'll still run into the 255 column limit, since MS Access uses temporary tables, etc.
However, I think you'll have far less work and better results if you change the structure of your table.
I assume that this is like a fill-in-the-bubble questionnaire, and it's mostly multiple choice. In that case, instead of recording a flag per option, I would record the chosen answer for each question:
SlNo Q1 Q2
1 B
2 B
3 A
4 A C
5 A
Then you have far fewer columns to work with, and you query for Q1='A' instead of Q1a=1.
The alternative is to break the table up into sections (personal, career, etc.), then do a join and only show the columns you need (so as not to exceed the 255 column limit).
A way to do this that handles more questions is to have a table for the person, a table for the questions, and a table for the responses:
Person
SlNo PostalCode
1 90210
2 H0H 0H0
3
Questions
QID, QTitle, QDesc
1 Q1a Gender Male
2 Q1b Gender Female
3 Q2a Boat
4 Q2b Car
Answers
SlNo QID Result
1 2 True
1 3 True
1 4 True
2 1 True
2 3 False
2 4 True
You can then find the question takers by selecting Persons from a list of Answers
select * from Person
where SlNo in (
select SlNo from Answers, Questions
where
questions.qid = answers.qid
and
qtitle = 'Q1a'
and
answers.result='True')
and SlNo in (
select SlNo from Answers, Questions
where
questions.qid = answers.qid
and
qtitle = 'Q2a'
and
answers.result='True')
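The person/questions/answers layout can be tried out with the sample rows above. The sketch below uses Python's built-in sqlite3 module rather than Access so that it runs on its own; the helper people_with is hypothetical, it expects at least one question title, and it builds one IN subquery per title in the same style as the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (SlNo INTEGER PRIMARY KEY, PostalCode TEXT);
CREATE TABLE Questions (QID INTEGER PRIMARY KEY, QTitle TEXT, QDesc TEXT);
CREATE TABLE Answers (SlNo INTEGER, QID INTEGER, Result TEXT);
INSERT INTO Person VALUES (1,'90210'),(2,'H0H 0H0'),(3,NULL);
INSERT INTO Questions VALUES
    (1,'Q1a','Gender Male'),(2,'Q1b','Gender Female'),
    (3,'Q2a','Boat'),(4,'Q2b','Car');
INSERT INTO Answers VALUES
    (1,2,'True'),(1,3,'True'),(1,4,'True'),
    (2,1,'True'),(2,3,'False'),(2,4,'True');
""")

# One membership test per required question title.
TITLE_CLAUSE = """SlNo IN (
    SELECT a.SlNo
    FROM Answers a
    JOIN Questions q ON q.QID = a.QID
    WHERE q.QTitle = ? AND a.Result = 'True'
)"""

def people_with(conn, qtitles):
    """Return SlNo of persons who answered 'True' to every listed question title."""
    where = " AND ".join([TITLE_CLAUSE] * len(qtitles))
    sql = "SELECT SlNo FROM Person WHERE " + where + " ORDER BY SlNo"
    return [slno for (slno,) in conn.execute(sql, list(qtitles))]

print(people_with(conn, ["Q1b", "Q2a"]))  # [1]
print(people_with(conn, ["Q1a", "Q2b"]))  # [2]
print(people_with(conn, ["Q1a", "Q2a"]))  # []
```

Adding a new question in this layout means adding rows, not columns, so the 255 column limit no longer applies.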
I finally got a solution.
I created two tables, each with 225 columns
(450 columns in total).
I created a SQL statement
select count(*) from T1,T2 WHERE T1.SlNo=T2.SlNo
and added the conditions I want.
The results are correct after this.
The data was entered wrongly by other staff in the beginning, but throwing away one week of work was not an option, so I had to stick to this design. The deadline is next week, and now it's working :) :)