MYSQL Group By Returning Duplicate Values

MYSQL Group By Returning Duplicate Values - mysql

I am seeing a weird problem with MYSQL GROUP BY.
I have a query...
SELECT schools.schoolregion,
Count(schools.schoolregion) AS regioncount,
(
SELECT Count(jobs_jobsubject)
FROM 'jobs'
WHERE 'jobs_createdDate' BETWEEN '$startofyear'
AND '$endofyear') AS regionjobstotal
FROM 'jobs'
LEFT JOIN 'schools'
ON 'jobs_schoolID'='SID'
WHERE 'jobs_createdDate' BETWEEN '$startofyear'
AND '$endofyear'
GROUP BY 'schoolRegion'
...in which I am attempting to total the number of job postings listed per region and group by region. I have two tables, one with a list of schools and another with job information that has a column value that joins back to the school. I need the region total, and the overall total of jobs within a time period (hence the sub query).
When I run this query, I get everything that I expect - except that I am getting a duplicate region listing in the returned results of the GROUP BY function.
For example, here is the table that I am getting but not sure why the duplicate for the Middle East.
schoolRegion regioncount regionjobstotal
Africa 1 38
Asia 6 38
Middle East 20 38
Middle East 11 38
I thought maybe there was an extra character or something, but I could not find/see anything different about the values within the tables - which for that column is being stored as type "text". Is there anything I can check for? Is it something to do with the query?
Any help would be fantastic and much appreciated!!

My guess is that the data is not ordered by schoolRegion. I would add an ORDER BY schoolRegion ASC to your query to ensure that they are organized thusly. :)

OMG, do I feel like a noob!!
When I adjusted the query to list the schools, there was only one school that was not included in the GROUP BY. Initially when I looked at this hours ago, inline editing in PHPMYADMIN didn't show that there was a character return AFTER the text - so I wrote off that it was the text of the value being stored. But when I checked the box to edit the row individually and not inline and went to that column value - low and behold - a carriage return!!! Sometimes it's the little things like that which kill and humble me.

First, i do not think you can supply a child select statement as a column in your parent select statement "(SELECT COUNT(jobs_jobSubject)...".
Also since the where clause for your child and parent select are thesame, why not use a single select statement and get the count of both.
SELECT schools.schoolRegion,
COUNT(schools.schoolRegion) AS regioncount,
COUNT(jobs_jobSubject) AS regionjobstotal
FROM 'jobs' jb
INNER JOIN 'schools' sc ON jb.jobs_schoolID=sc.SID
WHERE 'jobs_createdDate'
BETWEEN '$startofyear' AND '$endofyear'
GROUP BY 'schoolRegion'

Related

SSRS report - add extra table and chart to show occurence of character string in one of the main report's columns

I have an SSRS report based on stored procedure dataset. The report shows employees and their performance rating and uses bunch of parameters to filter the data.
Now I would like to add a table below that would dynamicaly count and show occurence of given mark in the main report. The table data should update according to what is visible in the main report after filtering it.
I wanted also to add a chart that would visualize this.
It would be feasible to do if the extra table and chart could run from the same dataset as the main report. This however seems impossible as this dataset does not always contain all the possible marks. It can happen that some marks are missing (when filtered or missing at all), and I would like to show the mark with zero value (and zero value bar in the chart) instead of just skipping it.
So far I was able to produce the table by hardcoding the headers and using SUM(IIF...) expressions under each header
Here is the expression for the "C" column.
=Sum(IIf(Fields!current_performance_rating.Value = "C", 1, 0))
It shows correctly the number of "C" marks appearing in the main report.
Now I am stuck with creating a chart that would show this.
I am not able to hardcode expressions similar to the ones in the table
and can't make the chart run from the main dataset, because the categories
would be missing after filtering the report (and not showing zero).
I tried linking datasets with Lookup function, but that did not work.
Which way should I go now? What is the best practice in such case?
Thank you for any hints!
Thanks trubs.
I have right joined a view that contains
all the marks and that solves the issue
of missing categories.
The join is on something like
tb.current_performance_rating = vw.performance_rating_code
I can now add the value series that counts
the occurence of current_performance_rating
per category. This works all fine.
However there is another table joined
(on employee_id) that stores last year's rating.
This rating obviously may differ to the current one.
On the same chart I would like to add another
series that counts last year's rating per category.
The category is there already, joined to the current
rating.
So you can have row like:
curren rating | last year's rating | category
C | H | C
So I am stuck, because when SSRS groups per category
it counts last year's H rating and displays
i the C category, while it should display it in H.
Sorry I can't post any pictures, seems like I
need more reputation points.
Hope you can understand what I mean.
Regards!

You're likely better off changing your query to return results for marks where there are no values.
So instead of inner joining to your Performance Rating table, use a Right Join or Full Outer Join, so that all the data is always available.
eg: Instead of...
SELECT p.PersonId, PersonName, g.Grade
FROM Person p
INNER JOIN PersonGrades pg ON p.PersonId = pg.PersonId
INNER JOIN Grades g ON pg.GradeId = pg.GradeId
Use
SELECT p.PersonId, PersonName, g.Grade
FROM Grades g
LEFT JOIN PersonGrades pg ON pg.GradeId = g.GradeId
LEFT JOIN Person p ON p.PersonId = pg.PersonId
If that's not clear, post your query and we could help get the data in the right format.

The way to go was to simplify the dataset that the main report was based on.
Then I took the query from that dataset, used COUNT and GROUB BY to count number of occurence of each current year mark in the main report. Then I right joined (thx trubs!) with the view that contains all the possible marks (so that my extra table headers/chart categories would not disappear if they do not exist). As a result I got the number of occurence in current year per mark (table A).
Then I did almost exactly the same for the last year's rating, just used last year's mark. I got the number of occurence of last year's mark per mark (table B).
I inner joined table A and B on common column (the mark, which will always appear thanks to the RIGHT joins). This gave me a table (dataset eventualy) where I had:
mark | current year mark count | last year mark count.
Making a table and chart basing on this was really easy then.
There was some more fun of adding all the parameters from the main report to both the count queries, so that the counts would change when report is filtered. I also needed to make sure that count works not only when my filter criteria (in WHERE stetement) equal the parameter provided from the report, but also when they are NULLs (so I added OR column_i_filter_on IS NULL).
This works smoothly - the table content and chart changes when filtering changes (although runs slowly, as parameters are passed to two big dataset, one of which uses them twice).
Thanks for all the help!!
psh

Selecting values where the first 6 characters match

I'm trying to pull a list of IDs from a table Company where the first 6 characters of the ID are the same. The way our application creates a company ID is it takes the first 3 characters of the company name and the first 3 characters of the City. Beceause of that, overtime we have company IDs with the same first 6 characters, followed by a sequential number...
I was thinking using something using LIKE
Select companyID, companyName from Company Where
substring(companyID,1,6)+'%' like substring(companyID,1,6)+'%'
Basically i'm trying to get all company IDs where the first 6 characters match; The result set should show the just the top company ID ( The first 1 created) and the company name. I'm not expecting a tone of results, so i can then use the IDs returned to find the IDs below it.
I'm thinking it could maybe also be done using HAVING, where the count of IDs with the same first 6 characters are the same HAVING Count(*)>1??
Not really sure what the syntax would be...

SELECT distinct c1.CompanyID, c1.CompanyName, c2.CompanyID, c2.CompanyName
FROM dbo.Company c1
JOIN dbo.Company c2
ON SUBSTRING(c1.CompanyName,1,6) = SUBSTRING(c2.CompanyName,1,6)
AND c1.CompanyID < c2.CompanyID
order by c1.CompanyName, c2.CompanyName

SELECT c1.CompanyID, c1.CompanyName, c2.CompanyID, c2.CompanyName
FROM dbo.Company c1
INNER JOIN dbo.Company c2
ON SUBSTRING(c1.CompanyName,1,6) + '%' LIKE SUBSTRING(c2.CompanyName,1,6) + '%'
AND c1.CompanyID <> c2.CompanyID

If this is something that you envision doing frequently, I'd add a computed column to the table that has a definition of substring(CompanyName, 1, 6). You can then index it and make this efficient. As it is, it will have to scan all the entries and calculate the substring on the fly. With the computed column, you amortize the substring calculation up front and at least have a chance at an efficient query.

After trying to use Blam's script, i made a few slight changes and got some better results. His script was returning more results than rows in the table and it was pretty slow; think it's because of the company_name column. I got rid of it and wrote it like this:
select distinct c1.cmp_id, count(substring(c2.cmp_id,1,6)) as TotalCount
from company c1
join company c2 on substring(c1.cmp_id,1,6)=substring(c2.cmp_id,1,6)
group by c1.cmp_id
order by c1.cmp_id asc
This still returns all the table records, but atleast i can see the total count when the first 6 characters are listed more than once. Also, it ran in only 1 second so that's also a plus. Thank again for you input guys, always appreciated!

Find first, second, third, and so forth record per person

I have a 1 to many relationship between people and notes about them. There can be 0 or more notes per person.
I need to bring all the notes together into a single field and since there are not going to be many people with notes and I plan to only bring in the first 3 notes per person I thought I could do this using at most 3 queries to gather all my information.
My problem is in geting the mySQL query together to get the first, second, etc note per person.
I have a query that lets me know how many notes each person has and I have that in my table. I tried something like
SELECT
f_note, f_person_id
FROM
t_person_table,
t_note_table
WHERE
t_person_table.f_number_of_notes > 0
AND t_person_table.f_person_id = t_note_table.f_person_id
GROUP BY
t_person_table.f_person_id
LIMIT 1 OFFSET 0
I had hoped to run this up to 3 times changing the OFFSET to 1 and then 2 but all I get is just one note coming back, not one note per person.
I hope this is clear, if not read on for an example:
I have 3 people in the table. One person (A) has 0 notes, one (B) with 1 and one (C) with 2.
First I would get the first note for person B and C and insert those into my person table note field.
Then I would get the second note for person C and add that to the note field in the person table.
In the end I would have notes for persons B and C where the note field for person C would be a concatination of their 2 notes.

Welcome to SO. The thing you're trying to do, selecting the three most recent items from a table for each person mentioned, is not easy in MySQL. But it is possible. See this question.
Select number of rows for each group where two column values makes one group
and, see my answer to it.
Select number of rows for each group where two column values makes one group
Once you have a query giving you the three rows, you can use GROUP_CONCAT() ... GROUP BY to aggregate the note fields.

You can get one note per person using a nested query like this:
SELECT
f_person_id,
(SELECT f_note
FROM t_note_table
WHERE t_person_table.f_person_id = t_note_table.f_person_id
LIMIT 1) AS note
FROM
t_person_table
WHERE
t_person_table.f_number_of_notes > 0
Note that tables in SQL are basically without a defined inherent order, so you should use some form or ORDER BY in the subquery. Otherwise, your results might be random, and repeated runs asking for different notes might unexpectedly return the same data.
If you only aim for a concatenation of notes in any case, then you can use the GROUP_CONCAT function to combine all notes into a single column.

How to query 3 mysql tables and return matching results (with one to many relationships)?

I am trying to query a database to return some matching records and can't work out how to do it in the most efficient way. I have a TUsers table, a TJobsOffered table and a TJobsRequested table. The UserID is the primary key for the TUsers table and is used within the Job tables in a one to many relationship.
Ultimately I want to run a query that returns a list of all matching users based on a particular UserID (eg a matching user is one that has at least one matching record in both tables, eg if UserA has jobid 999 listed in TJobsOffered and UserB has jobid 999 listed in TJobsRequested then this is a match).
In order to try and get my head around it i've simplified it down a lot and am trying to match the records based on the jobids for the user in question, eg:
SELECT DISTINCT TJobsOffered.FUserID FROM TJobsOffered, TJobsRequested
WHERE TJobsOffered.FUserID=TJobsRequested.FUserID AND
(TJobsRequested.FJobID='12' OR TJobsRequested.FJobID='30') AND
(TJobsOffered.FJobID='86' OR TJobsOffered.FJobID='5')
This seems to work fine and returns the correct results however when I introduce the TUsers table (so I can access user information) it starts returning incorrect results. I can't work out why the following query doesn't return the same results as the one listed above as surely it's still matching the same information just with a different connector (or is the one above effectively many to many and the one below 2 sets of one to many comparisons)?
SELECT DISTINCT TUsers.Fid, TUsers.FName FROM TUsers, TJobsOffered, TJobsRequested
WHERE TUsers.Fid=TJobsRequested.FUserID AND TUsers.Fid=TJobsOffered.FUserID AND
(TJobsRequested.FJobID='12' OR TJobsRequested.FJobID='30') AND
(TJobsOffered.FJobID='86' OR TJobsOffered.FJobID='5')
If anyone could explain where i'm going wrong with the second query and how you should incorporate TUsers then that would be greatly appreciated as I can't get my head around the join. If you are able to give me any pointers as to how I can do this all in one query by just passing the user id in then that would be massively appreciated as well! :)
Thanks so much,
Dave

Try this
SELECT DISTINCT TJobsOffered.FUserID , TUsers.FName
FROM TJobsOffered
INNER JOIN TJobsRequested ON TJobsOffered.FUserID=TJobsRequested.FUserID
LEFT JOIN TUsers ON TUsers.Fid=TJobsOffered.FUserID
WHERE
(TJobsRequested.FJobID (12,30) AND
(TJobsOffered.FJobID IN (86 ,5)

You need to add "AND TJobsOffered.FUserID=TJobsRequested.FUserID" to your where clause.

Query MySQL for rows that share a value, and returning them as columns?

This is for a homework assignment. I haven't copy-pasted the question below, I made an simpler version of it that focuses on the specific area where I'm stuck.
Let's say I have a table of two values: a person's name, and the place he had lunch yesterday. Assume everyone has lunch in pairs. How can I query the database to return all the pairs of people that had lunch together yesterday? Each pair must be only listed once.
I'm actually not even sure what the professor means by return them as pairs. I've sent him an email, but no reply yet. It seems like he wants me to write a query that returns a table with column 1 as person 1 and column 2 as person 2.
Any suggestions on how to go about this? Does it seem right to assume he wants them as separate columns?
So far, I basically have:
SELECT name, restaurant FROM lunches GROUP BY restaurant, name
which essentially just reorganizes the table so that the people who had lunch together are one after the other.

We have to assume there can be only one pair eating lunch in a given restaurant.
You can get a list of pairs either using self-join:
SELECT l1.name, l2.name FROM lunches l1
JOIN lunches l2
ON l1.restaurant = l2.restaurant AND l1.name < l2.name
or using GROUP BY:
SELECT GROUP_CONCAT(name) FROM lunches
GROUP BY restaurant
The first query will return pairs in two different columns, while the second in one column, using comma as separator (default for GROUP_CONCAT, you can change it to whatever you wish).
Also note that for the first query names in pairs will come in alphabetical order as we use < instead of <> to avoid listing each pair twice.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008