MySQL total is bigger than individual elements

I have a huge MySQL database that is in use in production. Its tables are named by geographical area, with foreign keys for the smaller divisions. It is structured like db_country -> tbl_city1, tbl_city2, tbl_city3, and each tbl_city has one row per household with details such as street, address and number of people.
PROBLEM: If I pull the data for each family grouped by street, I get the correct number of people, but when I query the number of people in the entire city, I get a slightly inflated number. I know I must have messed something up; where is the likely point of failure in this scenario?
The first query that yields inflated output is:
SELECT sum(people) AS people FROM city WHERE division=2136 AND status=1;
and the second query that yields correct output is:
SELECT streets.name, SUM(people) AS people FROM city INNER JOIN streets
ON city.street=streets.id WHERE division=2136 AND status=1
GROUP BY streets.name;
The picture shows the real output: the upper table shows the combined (inflated) total, while the lower one shows the individual streets/villages with the correct numbers.

There are records in the city table that have no related streets record matching the join condition city.street=streets.id. The INNER JOIN drops them from the per-street query, but the total query still counts them.
Try this SQL to find the missing records:
SELECT city.* FROM city LEFT JOIN streets ON city.street=streets.id
WHERE division=2136 AND status=1 AND streets.id IS NULL;
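To see the effect on a small scale, here is a minimal sketch. It uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for MySQL, and the rows are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE streets (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE city (id INTEGER PRIMARY KEY, street INTEGER,
                       division INTEGER, status INTEGER, people INTEGER);
    INSERT INTO streets VALUES (1, 'Main St'), (2, 'Hill Rd');
    INSERT INTO city VALUES
        (1, 1, 2136, 1, 4),
        (2, 1, 2136, 1, 3),
        (3, 2, 2136, 1, 5),
        (4, 99, 2136, 1, 2);  -- street 99 has no row in streets
""")

# City-wide total without a join: every household is counted.
total = conn.execute(
    "SELECT SUM(people) FROM city WHERE division = 2136 AND status = 1"
).fetchone()[0]

# Per-street totals: the INNER JOIN silently drops household 4.
per_street = conn.execute("""
    SELECT streets.name, SUM(people)
    FROM city INNER JOIN streets ON city.street = streets.id
    WHERE division = 2136 AND status = 1
    GROUP BY streets.name
    ORDER BY streets.name
""").fetchall()

# Anti-join: households whose street id matches no streets row.
orphans = conn.execute("""
    SELECT city.id, city.street, city.people
    FROM city LEFT JOIN streets ON city.street = streets.id
    WHERE division = 2136 AND status = 1 AND streets.id IS NULL
""").fetchall()

print(total)       # 14
print(per_street)  # [('Hill Rd', 5), ('Main St', 7)]
print(orphans)     # [(4, 99, 2)]
```

The gap between the city-wide total (14) and the sum of the per-street rows (12) is exactly the people in the orphaned household.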

Related

MYSQL Group By Returning Duplicate Values

I am seeing a weird problem with MYSQL GROUP BY.
I have a query...
SELECT schools.schoolregion,
Count(schools.schoolregion) AS regioncount,
(
SELECT Count(jobs_jobsubject)
FROM `jobs`
WHERE `jobs_createdDate` BETWEEN '$startofyear'
AND '$endofyear') AS regionjobstotal
FROM `jobs`
LEFT JOIN `schools`
ON `jobs_schoolID`=`SID`
WHERE `jobs_createdDate` BETWEEN '$startofyear'
AND '$endofyear'
GROUP BY `schoolRegion`
...in which I am attempting to total the number of job postings listed per region and group by region. I have two tables, one with a list of schools and another with job information that has a column value that joins back to the school. I need the region total, and the overall total of jobs within a time period (hence the sub query).
When I run this query, I get everything that I expect - except that I am getting a duplicate region listing in the returned results of the GROUP BY function.
For example, here is the table that I am getting but not sure why the duplicate for the Middle East.
schoolRegion   regioncount   regionjobstotal
Africa         1             38
Asia           6             38
Middle East    20            38
Middle East    11            38
I thought maybe there was an extra character or something, but I could not find/see anything different about the values within the tables - which for that column is being stored as type "text". Is there anything I can check for? Is it something to do with the query?
Any help would be fantastic and much appreciated!!
My guess is that the data is not ordered by schoolRegion. I would add an ORDER BY schoolRegion ASC to your query to ensure that they are organized thusly. :)
OMG, do I feel like a noob!!
When I adjusted the query to list the schools, there was only one school that was not included in the GROUP BY. When I first looked at this hours ago, inline editing in phpMyAdmin didn't show that there was a carriage return AFTER the text, so I ruled out the stored text as the cause. But when I checked the box to edit the row individually rather than inline and went to that column value, lo and behold: a carriage return!!! Sometimes it's the little things like that which kill and humble me.
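For anyone hitting the same thing, here is a rough sketch of how such invisible characters can be found and neutralised. It runs against SQLite through Python's sqlite3 module as a stand-in for MySQL, with made-up rows; in MySQL itself the equivalent would be along the lines of TRIM(TRAILING '\r' FROM schoolRegion), and the two-argument RTRIM used below is SQLite-specific.

```python
import sqlite3

STRIP_CHARS = " \t\r\n"  # trailing characters to ignore when grouping

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schools (SID INTEGER PRIMARY KEY, schoolRegion TEXT)")
conn.executemany(
    "INSERT INTO schools VALUES (?, ?)",
    [(1, "Asia"), (2, "Middle East"), (3, "Middle East\r"), (4, "Middle East")],
)

# Grouping on the raw text treats 'Middle East' and 'Middle East\r' as different.
raw_groups = conn.execute(
    "SELECT schoolRegion, COUNT(*) FROM schools "
    "GROUP BY schoolRegion ORDER BY schoolRegion"
).fetchall()

# Rows whose length shrinks when trailing whitespace is stripped are the culprits.
suspects = conn.execute(
    "SELECT SID FROM schools "
    "WHERE LENGTH(schoolRegion) <> LENGTH(RTRIM(schoolRegion, ?))",
    (STRIP_CHARS,),
).fetchall()

# Grouping on the stripped value merges the duplicate region.
clean_groups = conn.execute(
    "SELECT RTRIM(schoolRegion, ?) AS region, COUNT(*) FROM schools "
    "GROUP BY region ORDER BY region",
    (STRIP_CHARS,),
).fetchall()

print(raw_groups)    # [('Asia', 1), ('Middle East', 2), ('Middle East\r', 1)]
print(suspects)      # [(3,)]
print(clean_groups)  # [('Asia', 1), ('Middle East', 3)]
```

Cleaning the stored value once (an UPDATE with the same trim expression) is probably better than trimming in every query.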
First, I do not think you can supply a child select statement as a column in your parent select statement "(SELECT COUNT(jobs_jobSubject)...".
Also, since the where clauses for your child and parent selects are the same, why not use a single select statement and get both counts?
SELECT sc.schoolRegion,
COUNT(sc.schoolRegion) AS regioncount,
COUNT(jb.jobs_jobSubject) AS regionjobstotal
FROM `jobs` jb
INNER JOIN `schools` sc ON jb.jobs_schoolID=sc.SID
WHERE jb.jobs_createdDate
BETWEEN '$startofyear' AND '$endofyear'
GROUP BY sc.schoolRegion

Query using two tables with DISTINCT

I have two tables: clients and group.
I need to get county and zip from clients and group-assigned from group.
When I search, I cannot get distinct results: the output shows, for example, 100 clients with zipcode 12345 in Jones county in the Main St group.
Instead, I need each zip and county listed once per group. I have googled and tried many approaches, but it is just beyond me.
Can anyone steer me toward the correct way?
Adding GROUP BY group, city, zip to the end of your query should get you what you need. It will only return unique combinations of the three.
Presumably you have something like:
select g.*, c.county, c.zip
from clients c join groups g on <some join condition>
You want one result per group. So, add a group by clause such as:
group by g.id -- assuming id uniquely identifies each group
This will give an arbitrary value for the other fields, which may be sufficient for what you are doing. (This uses a MySQL feature called Hidden Columns, which only works when the ONLY_FULL_GROUP_BY SQL mode is disabled.)
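If the goal is one row per unique group/county/zip combination, grouping by all three columns avoids the arbitrary-value behaviour. Below is a small sketch of that, run on SQLite via Python's sqlite3 module as a stand-in for MySQL. The schema is an assumption, since the question does not show one: client_groups, group_id and name are hypothetical names (the table is not called group or groups because those are reserved words in recent MySQL versions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema; the question does not show the real one.
    CREATE TABLE client_groups (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE clients (id INTEGER PRIMARY KEY, group_id INTEGER,
                          county TEXT, zip TEXT);
    INSERT INTO client_groups VALUES (1, 'Main St'), (2, 'Oak Ave');
    INSERT INTO clients VALUES
        (1, 1, 'Jones', '12345'),
        (2, 1, 'Jones', '12345'),
        (3, 1, 'Jones', '12346'),
        (4, 2, 'Smith', '54321');
""")

# One row per unique (group, county, zip) combination.
rows = conn.execute("""
    SELECT g.name, c.county, c.zip
    FROM clients c JOIN client_groups g ON c.group_id = g.id
    GROUP BY g.name, c.county, c.zip
    ORDER BY g.name, c.county, c.zip
""").fetchall()

for row in rows:
    print(row)
# ('Main St', 'Jones', '12345')
# ('Main St', 'Jones', '12346')
# ('Oak Ave', 'Smith', '54321')
```

SELECT DISTINCT g.name, c.county, c.zip would return the same rows here; GROUP BY is only needed if you also want an aggregate such as COUNT(*) per combination.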

Query MySQL for rows that share a value, and returning them as columns?

This is for a homework assignment. I haven't copy-pasted the question below; I made a simpler version of it that focuses on the specific area where I'm stuck.
Let's say I have a table of two values: a person's name, and the place he had lunch yesterday. Assume everyone has lunch in pairs. How can I query the database to return all the pairs of people that had lunch together yesterday? Each pair must be only listed once.
I'm actually not even sure what the professor means by return them as pairs. I've sent him an email, but no reply yet. It seems like he wants me to write a query that returns a table with column 1 as person 1 and column 2 as person 2.
Any suggestions on how to go about this? Does it seem right to assume he wants them as separate columns?
So far, I basically have:
SELECT name, restaurant FROM lunches GROUP BY restaurant, name
which essentially just reorganizes the table so that the people who had lunch together are one after the other.
We have to assume there can be only one pair eating lunch in a given restaurant.
You can get a list of pairs either using self-join:
SELECT l1.name, l2.name FROM lunches l1
JOIN lunches l2
ON l1.restaurant = l2.restaurant AND l1.name < l2.name
or using GROUP BY:
SELECT GROUP_CONCAT(name) FROM lunches
GROUP BY restaurant
The first query will return the pairs in two different columns, while the second returns them in one column, using a comma as the separator (the default for GROUP_CONCAT; you can change it to whatever you wish).
Also note that in the first query the names in each pair will come in alphabetical order, as we use < instead of <> to avoid listing each pair twice.
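Both variants can be tried out quickly with a sketch like the one below. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL (SQLite also has GROUP_CONCAT with a comma as the default separator), and the names and restaurants are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lunches (name TEXT, restaurant TEXT)")
conn.executemany(
    "INSERT INTO lunches VALUES (?, ?)",
    [("Alice", "Diner"), ("Bob", "Diner"), ("Carol", "Cafe"), ("Dave", "Cafe")],
)

# Self-join: one row per pair, each person in their own column.
# l1.name < l2.name keeps only one ordering of each pair.
pairs = conn.execute("""
    SELECT l1.name, l2.name
    FROM lunches l1
    JOIN lunches l2
      ON l1.restaurant = l2.restaurant AND l1.name < l2.name
    ORDER BY l1.name
""").fetchall()

# GROUP_CONCAT: one row per restaurant, both names in a single column.
# The order of names inside each concatenated string is not guaranteed.
grouped = conn.execute("""
    SELECT GROUP_CONCAT(name)
    FROM lunches
    GROUP BY restaurant
""").fetchall()

print(pairs)  # [('Alice', 'Bob'), ('Carol', 'Dave')]
print(grouped)
```

With <> instead of <, the self-join would also return ('Bob', 'Alice') and ('Dave', 'Carol').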

What is the best way to count rows in a mySQL complex table

I have a table with the following fields (for example):
id, reference, customerId.
I often want to log an enquiry for a customer, but in some cases I need to filter the enquiries by the customer's country, which is stored in the customer table, for example:
id, Name, Country
At the moment, my application shows 15 enquiries per page and I am SELECTing all enquiries, and for each one, checking the country field in customerTable based on the customerId to filter the country. I would also count the number of enquiries this way to find out the total number of enquiries and be able to display the page (Page 1 of 4).
As the database is growing, I am starting to notice a bit of lag, and I think my methodology is a bit flawed!
My first guess at how this should be done, is I can add the country to the enquiryTable. Problem solved, but does anyone else have a suggestion as to how this might be done? Because I don't like the idea of having to update each enquiry every time the country of a contact is changed.
Thanks in advance!
It looks to me like this data should be spread over 3 tables
customers
enquiries
countries
Then by using joins you can bring out the customer and country data and filter by either. Something like.....
SELECT
enquiries.enquiryid,
enquiries.enquiredetails,
customers.customerid,
customers.reference,
customers.countryid,
countries.name AS countryname
FROM
enquiries
INNER JOIN customers ON enquiries.customerid = customers.customerid
INNER JOIN countries ON customers.countryid = countries.countryid
WHERE countries.name='United Kingdom'
You should definitely be only touching the database once to do this.
Depending on how you are accessing your data, you may be able to get a row count without issuing a second COUNT(*) query. You haven't mentioned what programming language or data access strategy you use, so it is difficult to be more helpful with the count. If you have no easy way of determining the row count from within the data access layer of your code, you could use a stored procedure with an output parameter to return the row count without making two round trips to the database. It all depends on your architecture, data access strategy and how close you are to your database.
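As a rough illustration of letting the database do the filtering and paging, here is a sketch using Python's sqlite3 module, with SQLite standing in for MySQL. It follows the two-table layout from the question (country stored on the customer) rather than the three-table layout above, the rows are generated, and fetch_page and count_pages are hypothetical helper names. For simplicity it still issues a separate COUNT(*) query rather than a stored procedure.

```python
import math
import sqlite3

PAGE_SIZE = 15  # page size mentioned in the question

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
    CREATE TABLE enquiries (id INTEGER PRIMARY KEY, reference TEXT,
                            customerId INTEGER REFERENCES customers(id));
    INSERT INTO customers VALUES
        (1, 'Acme', 'United Kingdom'), (2, 'Globex', 'France');
""")
# 50 made-up enquiries: every fifth one belongs to the French customer.
conn.executemany(
    "INSERT INTO enquiries VALUES (?, ?, ?)",
    [(i, f"REF-{i}", 2 if i % 5 == 0 else 1) for i in range(1, 51)],
)


def fetch_page(country, page):
    """Return one page of enquiries for customers in the given country."""
    return conn.execute(
        """
        SELECT e.id, e.reference, c.name
        FROM enquiries e
        INNER JOIN customers c ON e.customerId = c.id
        WHERE c.country = ?
        ORDER BY e.id
        LIMIT ? OFFSET ?
        """,
        (country, PAGE_SIZE, (page - 1) * PAGE_SIZE),
    ).fetchall()


def count_pages(country):
    """Return (total matching enquiries, number of pages)."""
    total = conn.execute(
        """
        SELECT COUNT(*)
        FROM enquiries e
        INNER JOIN customers c ON e.customerId = c.id
        WHERE c.country = ?
        """,
        (country,),
    ).fetchone()[0]
    return total, math.ceil(total / PAGE_SIZE)


page1 = fetch_page("United Kingdom", 1)
total, pages = count_pages("United Kingdom")
print(f"Page 1 of {pages}: {len(page1)} of {total} enquiries")
# Page 1 of 3: 15 of 40 enquiries
```

Only the 15 rows for the requested page leave the database, so the per-enquiry country lookups in application code go away. An index on enquiries.customerId would likely help as the table grows.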

What is wrong with this MySQL Query? need help to understand

I have 4 tables, namely
countries, states, cities, areas.
Apart from the countries table, the other three (states, cities, areas) contain a country_id foreign key.
I wanted to return the combined count of rows for a country_id across the three tables, for which I used jon_darstar's solution. Here is the code I am using:
SELECT COUNT(DISTINCT(states.id)) + COUNT(DISTINCT(cities.id)) + COUNT(DISTINCT(areas.id))
FROM states
JOIN cities on cities.country_id = states.country_id
JOIN areas on areas.country_id = states.country_id
WHERE states.country_id IN (118);
The above code works perfectly fine, although I am unable to understand it properly, mainly the first line, i.e.
SELECT COUNT(DISTINCT(states.id)) + COUNT(DISTINCT(cities.id)) + COUNT(DISTINCT(areas.id))
Question 1: Doesn't that select the primary id of the three tables states, cities and areas and count them? I know from the result I am getting that this is not what happens, so what is actually happening here?
However, if I remove DISTINCT from the query, it shows me a different result: a count of 120, whereas it should be 15 (the number of rows with this country_id across all three tables).
Question 2: What changes when I use DISTINCT versus when I remove it? Isn't DISTINCT supposed to remove duplicate values? Where is the duplication happening here?
Thank you.
For example, suppose that in country A (primary id a.id=118) there is state B (primary id b.id), inside that state there is city C (primary id c.id), and in city C there are areas D (primary id d.id), E (e.id) and F (f.id). Let's visualize the query result as a database table:
C St Ct Ar
A->B->C->D
A->B->C->E
A->B->C->F
(Here C=Country,St=States,Ct=Cities,Ar=Areas)
Now think about what happens when you count on the above table to get the total number of states within country A without DISTINCT. The result is 3; likewise the number of cities is 3 and the number of areas is 3, for a total of 9. Without DISTINCT you are counting duplicate values that you are not interested in.
Now, if you use COUNT(DISTINCT ...), you get the correct result, because the number of distinct states under country A is 1, cities is 1 and areas is 3, for a total of 5 (excluding duplicate values).
Hope this works!
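The 9-versus-5 arithmetic above can be reproduced with a small sketch. It uses Python's sqlite3 module with SQLite as a stand-in for MySQL, and loads one state, one city and three areas for country 118, matching the example. The last query (separate subquery counts per table) is an extra alternative for comparison, not something taken from the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE states (id INTEGER PRIMARY KEY, country_id INTEGER);
    CREATE TABLE cities (id INTEGER PRIMARY KEY, country_id INTEGER);
    CREATE TABLE areas  (id INTEGER PRIMARY KEY, country_id INTEGER);
    INSERT INTO states VALUES (1, 118);
    INSERT INTO cities VALUES (1, 118);
    INSERT INTO areas  VALUES (1, 118), (2, 118), (3, 118);
""")

# The join from the question: 1 state x 1 city x 3 areas = 3 joined rows.
JOINED = """
    FROM states
    JOIN cities ON cities.country_id = states.country_id
    JOIN areas  ON areas.country_id  = states.country_id
    WHERE states.country_id IN (118)
"""

# Without DISTINCT every joined row is counted once per column: 3 + 3 + 3.
plain = conn.execute(
    "SELECT COUNT(states.id) + COUNT(cities.id) + COUNT(areas.id)" + JOINED
).fetchone()[0]

# With DISTINCT the repeated ids collapse: 1 + 1 + 3.
distinct = conn.execute(
    "SELECT COUNT(DISTINCT states.id) + COUNT(DISTINCT cities.id)"
    " + COUNT(DISTINCT areas.id)" + JOINED
).fetchone()[0]

# Alternative without any join: count each table on its own and add up.
separate = conn.execute("""
    SELECT (SELECT COUNT(*) FROM states WHERE country_id = 118)
         + (SELECT COUNT(*) FROM cities WHERE country_id = 118)
         + (SELECT COUNT(*) FROM areas  WHERE country_id = 118)
""").fetchone()[0]

print(plain, distinct, separate)  # 9 5 5
```

The separate-counts form also keeps working when one of the tables has no rows for the country, whereas the inner-join form would then return 0 for everything.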
!!Design Issue!!!
One more thing: from your database design, I can see that you use country_id as a reference to the countries table from states, cities and areas, and then join states to cities and states to areas by country_id. Don't you think that creates a cross join between the rows of each country? A better design is to keep a city foreign key in the areas table and work bottom-up from there: keep the state in cities and the country in states. Alternatively, make an areas table that holds the country, state and city foreign keys alongside the area primary key.