MySQL find averages based on multiple factors - mysql

I have a table that looks something like this:
+--------------------------+--------+------+---------+
| Project | City | Year | Density |
+--------------------------+--------+------+---------+
| Project 1 | City A | 2008 | 500 |
+--------------------------+--------+------+---------+
| Project 2 | City B | 2012 | 800 |
+--------------------------+--------+------+---------+
| Project 3 | City C | 2012 | 400 |
+--------------------------+--------+------+---------+
| Project 4 | City A | 2008 | 600 |
+--------------------------+--------+------+---------+
| Project 5 | City C | 2013 | 700 |
+--------------------------+--------+------+---------+
| etc (c. 30,000 projects spread across 30 cities) |
+--------------------------+--------+------+---------+
I can write a query like:
SELECT Year, AVG(`Density`) AS Density FROM table WHERE City = 'A' GROUP BY Year
Which works fine for one city. Could anyone point me in the right direction as to how I write a single query that would calculate the average by year for each city? I’d anticipate a results table that looked something like this:
+------+--------+--------+--------+-------------+
| Year | City A | City B | City C | City D, etc |
+------+--------+--------+--------+-------------+
| 2005 | | | | |
+------+--------+--------+--------+-------------+
| 2006 | | | | |
+------+--------+--------+--------+-------------+
| 2008 | | | | |
+------+--------+--------+--------+-------------+
| 2009 | | | | |
+------+--------+--------+--------+-------------+
| 2010 | | | | |
+------+--------+--------+--------+-------------+
| etc | | | | |
+------+--------+--------+--------+-------------+
I have tried to use a subquery in the where clause (where in (select distinct City)) but that did not behave as I expected.
Or do I just have to do a separate line for each of the 30 cities by hand?
I am no expert with MySQL and can't see conceptually what I need to do. If anyone could give me any pointers I would be very grateful. Thanks.

You can group by multiple columns:
SELECT city, year, AVG(density) AS density
FROM table
GROUP BY city, year
This will return a separate row for each city/year combination. To get cities as columns, you'll need to pivot the result. See the question "MySQL pivot table".
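One common way to do that pivot is conditional aggregation: one AVG(CASE WHEN city = ... THEN density END) column per city. Here is a minimal sketch, assuming a hypothetical table name `projects` and using Python's built-in sqlite3 as a stand-in for MySQL (the generated SELECT should read the same in MySQL). The column list is built from the distinct cities so the 30 columns don't have to be typed by hand.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; `projects` is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects (name TEXT, city TEXT, year INTEGER, density REAL)")
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?, ?)",
    [
        ("Project 1", "City A", 2008, 500),
        ("Project 2", "City B", 2012, 800),
        ("Project 3", "City C", 2012, 400),
        ("Project 4", "City A", 2008, 600),
        ("Project 5", "City C", 2013, 700),
    ],
)

# One AVG(CASE ...) column per city; rows for other cities become NULL
# inside the CASE and are ignored by AVG.
cities = [row[0] for row in conn.execute("SELECT DISTINCT city FROM projects ORDER BY city")]
columns = ", ".join("AVG(CASE WHEN city = ? THEN density END)" for _ in cities)
pivot_sql = f"SELECT year, {columns} FROM projects GROUP BY year ORDER BY year"

pivot = conn.execute(pivot_sql, cities).fetchall()
for row in pivot:
    print(row)  # (year, average for each city in `cities` order, None if no projects)
```

A year with no projects for a given city comes back as NULL (None here) rather than 0, which is probably what you want for an average.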

Related

Counting "subcolumns" with mySQL

Guys, let me make myself clear. I'm studying MySQL and practicing the COUNT() function. I have a table called "City" with the columns ID, name, CountryCode, district, and Population. My first idea was to find out how many cities I have per country:
SELECT *, Count(name) as "total" FROM world.city GROUP BY countrycode;
It worked: an extra column was created with the number of cities for each country. Now I would like to know how many countries I have, by counting the number of distinct rows. (I know this information is shown at the bottom of Workbench, but I would like it to appear in my query.) I tried adding COUNT(CountryCode), but it didn't work as I expected: the number 4079 appeared, which is the total number of cities. I figured out that COUNT() is counting the rows inside each country, not the number of distinct country codes. Is it possible to get this information?
(A mini-lesson for a Novice.)
The first thing to learn is that COUNT(*) is the usual way to use COUNT. And you get the number of rows. In contrast, COUNT(name) counts the number of rows with non-NULL name values.
Next comes DISTINCT. It is not a function, so COUNT(DISTINCT a, b) counts the number of different combinations of a and b. COUNT(DISTINCT(a)) works correctly, but the parentheses are redundant, so use COUNT(DISTINCT a).
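To see the three forms side by side, here is a small sketch using Python's built-in sqlite3 as a stand-in for MySQL. The five rows are invented for illustration (including one NULL name); they are not the real world.city data.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT, countrycode TEXT)")
conn.executemany(
    "INSERT INTO city (name, countrycode) VALUES (?, ?)",
    [
        ("Kabul", "AFG"),
        ("Herat", "AFG"),
        ("Amsterdam", "NLD"),
        (None, "NLD"),  # a NULL name, to show how COUNT(name) differs from COUNT(*)
        ("Tirana", "ALB"),
    ],
)

counts = conn.execute(
    "SELECT COUNT(*), COUNT(name), COUNT(DISTINCT countrycode) FROM city"
).fetchone()
print(counts)  # (all rows, rows with a non-NULL name, distinct country codes)
```

The third value is the "how many countries" number the question asks for: it counts each country code once, however many cities carry it.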
Don't use * with GROUP BY. That is, SELECT *, ... GROUP BY ... is improper. The usual way to say something like your query is
SELECT countrycode, COUNT(*) AS "total"
FROM world.city
GROUP BY countrycode;
For provinces in Canada (which I happen to have a table of):
SELECT province, COUNT(*) AS "total" FROM world.canada GROUP BY province;
+---------------------------+-------+
| province | total |
+---------------------------+-------+
| Alberta | 573 |
| British Columbia | 716 |
| Manitoba | 299 |
| New Brunswick | 210 |
| Newfoundland and Labrador | 474 |
| Northwest Territories | 94 |
| Nova Scotia | 331 |
| Nunavut | 107 |
| Ontario | 891 |
| Prince Edward Island | 57 |
| Quebec | 1045 |
| Saskatchewan | 573 |
| Yukon | 114 |
+---------------------------+-------+
Note that a few cities show up in multiple provinces:
SELECT COUNT(DISTINCT city), COUNT(*) FROM world.canada;
+----------------------+----------+
| COUNT(DISTINCT city) | COUNT(*) |
+----------------------+----------+
| 5248 | 5484 |
+----------------------+----------+
Munch on this; there are some more lessons to learn:
SELECT city, COUNT(*) AS ct, GROUP_CONCAT(DISTINCT state)
FROM world.us
GROUP BY city
ORDER BY COUNT(*) DESC
LIMIT 11;
+-------------+----+----------------------------------+
| city | ct | GROUP_CONCAT(DISTINCT state) |
+-------------+----+----------------------------------+
| Springfield | 11 | FL,IL,MA,MO,NJ,OH,OR,PA,TN,VA,VT |
| Clinton | 10 | CT,IA,MA,MD,MO,MS,OK,SC,TN,UT |
| Madison | 8 | AL,CT,IN,ME,MS,NJ,SD,WI |
| Lebanon | 8 | IN,ME,MO,NH,OH,OR,PA,TN |
| Auburn | 7 | AL,CA,IN,ME,NH,NY,WA |
| Burlington | 7 | IA,MA,NC,NJ,VT,WA,WI |
| Washington | 7 | DC,IL,IN,MO,NC,PA,UT |
| Farmington | 7 | ME,MI,MN,MO,NH,NM,UT |
| Canton | 6 | GA,IL,MA,MI,MS,OH |
| Monroe | 6 | GA,LA,MI,NC,WA,WI |
| Lancaster | 6 | CA,NY,OH,PA,SC,TX |
+-------------+----+----------------------------------+
As for the number of cities in a country, that belongs in the table Countries, not in the table Cities. Then use a JOIN when you want to put them together.

MySQL Table -> How to Shift Column Data to the next Column and bring the last column to the first place (Rotating Columns Data)

I have this problem: I have a time schedule saved in a MySQL table.
The table looks like this, for example:
+-----+-----------+----------+---------+
| day | Group A   | Group B  | Group C |
+-----+-----------+----------+---------+
| sat | physics   | Language | Algebra |
| sun | Chemistry | Math     | Science |
| mon | History   | French   | GYM     |
| ... | ...       | ...      | ...     |
+-----+-----------+----------+---------+
So at the end of every month - there is a rotation in the schedule - So the result should be
+-----+-----------+-----------+----------+
| day | Group A   | Group B   | Group C  |
+-----+-----------+-----------+----------+
| sat | Algebra   | physics   | Language |
| sun | Science   | Chemistry | Math     |
| mon | GYM       | History   | French   |
| ... | ...       | ...       | ...      |
+-----+-----------+-----------+----------+
So the subjects of Group A become the subjects of Group B, and so on, with the last group's subjects moving to Group A (the tables should make this clearer).
It is like a rotation of the columns. I have a server-side script that will run once at the end of each month to update the schedule,
but I have no idea how to achieve this in MySQL. I also have quite a large number of groups (32, to be specific).
Any ideas on how to get this result?
Not an answer. Too long for a comment.
An example of a normalised design might look like this
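+-----+-----------+-----------+
| day | group_ref | subject   |
+-----+-----------+-----------+
| sat | A         | physics   |
| sat | B         | language  |
| sat | C         | algebra   |
| sun | A         | chemistry |
| sun | B         | math      |
| sun | C         | science   |
| mon | A         | history   |
| mon | B         | french    |
| mon | C         | gym       |
+-----+-----------+-----------+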
Note however that there is scope here for further optimisation
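With a normalised layout like that, the monthly rotation can be a single UPDATE instead of shuffling 32 columns. Here is a minimal sketch, assuming a hypothetical `schedule` table in which groups are stored as numbers 0..N-1 (A=0, B=1, C=2) in a made-up `group_no` column rather than as letters. Python's built-in sqlite3 stands in for MySQL, where the `%` operator also works.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; `schedule` and `group_no` are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schedule (day TEXT, group_no INTEGER, subject TEXT)")
conn.executemany(
    "INSERT INTO schedule VALUES (?, ?, ?)",
    [
        ("sat", 0, "physics"), ("sat", 1, "language"), ("sat", 2, "algebra"),
        ("sun", 0, "chemistry"), ("sun", 1, "math"), ("sun", 2, "science"),
        ("mon", 0, "history"), ("mon", 1, "french"), ("mon", 2, "gym"),
    ],
)

n_groups = conn.execute("SELECT COUNT(DISTINCT group_no) FROM schedule").fetchone()[0]

# Monthly rotation: every subject moves to the next group; the last group wraps to the first.
conn.execute("UPDATE schedule SET group_no = (group_no + 1) % ?", (n_groups,))

saturday = conn.execute(
    "SELECT group_no, subject FROM schedule WHERE day = 'sat' ORDER BY group_no"
).fetchall()
print(saturday)  # group 0 (A) now holds what group 2 (C) had before
```

One caveat: this sketch has no unique key on (day, group_no). With such a key, an in-place update like this may hit a duplicate-key error part-way through in MySQL, so you may need a different approach there (for example, storing a rotation offset instead of rewriting the rows).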

Poor UNION ALL performance in MySQL

I have a database with rows like the following:
+------------+---------+------------+-------+
| continent | country | city | value |
+------------+---------+------------+-------+
| Asia | China | Beijing | 3 |
| ... | ... | ... | ... |
| N. America | USA | D.C | 7 |
| .... | .... | .... | .... |
In order to generate a treemap visualization, I need to work this into a table with the following shape:
+-----+------------+-------+
| uid | parent-uid | value |
+-----+------------+-------+
In this case, Asia is the "parent" for China, which is the "parent" for Beijing. So for those three you'd have something like:
+---------+--------+-----+
| Beijing | China | 3 |
| China | Asia | ... |
| Asia | global | ... |
+---------+--------+-----+
The "value" for China needs to be an aggregate of all child values. Similarly the value of Asia needs to be an aggregate of all child values.
To accomplish this purely in SQL I created the following three queries and combined them with UNION ALL:
# City-level:
SELECT
CONCAT(continent, "-", country, "-", city) as uid,
CONCAT(continent, "-", country) as parentuid,
value
FROM
table
UNION ALL
# Country-level
SELECT
CONCAT(continent, "-", country) as uid,
continent as parentuid,
SUM(value) as value
FROM
table
GROUP BY
country
UNION ALL
# Continent-level
SELECT
continent as uid,
"global" as parentuid,
SUM(value) as value
FROM
table
GROUP BY
continent
Each of the individual queries completes in milliseconds: city-level, country-level, and continent-level all return results in < 0.01 seconds.
When I union them all together, it suddenly takes 8 seconds to get results!
I've tried Googling the issue, but everything just says "Use UNION ALL instead of UNION" (I already am).
I considered that it might not have enough RAM to build the temporary results table, so the disk is thrashing, but I don't know how to increase the memory limit. I tried bumping innodb_buffer_pool_size to 1GB (1073741824), but it didn't help.
The first SELECT returns every row in the table. Getting the first rows is very fast, but fetching all of them takes much longer (MySQL Workbench appends LIMIT 1000 to the end of a query by default).
To test whether fetching all rows is what takes the time, try the following query and tell us how long it takes:
select * from (
SELECT
CONCAT(continent, "-", country, "-", city) as uid,
CONCAT(continent, "-", country) as parentuid,
value
FROM
table
) t1;
If it takes almost 8 seconds, then your UNION is not the problem, and to improve performance you must limit the rows with a WHERE clause.
I hope this helps.
I guess my question is: what's wrong with WITH ROLLUP?
SELECT
CONCAT_WS('-', continent, country, city) AS uid,
CASE
WHEN country IS NULL THEN 'global' -- continent-level row
WHEN city IS NULL THEN continent -- country-level row
ELSE CONCAT_WS('-', continent, country) -- city-level row
END AS parentuid,
value
FROM (
SELECT continent, country, city, SUM(value) AS value
FROM table
GROUP BY continent, country, city WITH ROLLUP
) t1
WHERE t1.continent IS NOT NULL;
I may not have the CONCAT_WS() calls correct, especially if you have cities or countries named '', but I have to think this would be faster. The WHERE clause is just there to remove the overall summary.
Here's the example for WITH ROLLUP from the MySQL doc to help explain what it does:
mysql> SELECT year, country, product, SUM(profit)
-> FROM sales
-> GROUP BY year, country, product WITH ROLLUP;
+------+---------+------------+-------------+
| year | country | product | SUM(profit) |
+------+---------+------------+-------------+
| 2000 | Finland | Computer | 1500 |
| 2000 | Finland | Phone | 100 |
| 2000 | Finland | NULL | 1600 |
| 2000 | India | Calculator | 150 |
| 2000 | India | Computer | 1200 |
| 2000 | India | NULL | 1350 |
| 2000 | USA | Calculator | 75 |
| 2000 | USA | Computer | 1500 |
| 2000 | USA | NULL | 1575 |
| 2000 | NULL | NULL | 4525 |
| 2001 | Finland | Phone | 10 |
| 2001 | Finland | NULL | 10 |
| 2001 | USA | Calculator | 50 |
| 2001 | USA | Computer | 2700 |
| 2001 | USA | TV | 250 |
| 2001 | USA | NULL | 3000 |
| 2001 | NULL | NULL | 3010 |
| NULL | NULL | NULL | 7535 |
+------+---------+------------+-------------+

MySQL- Join based on column data

I have a table which has data in the following format:
+---------------------+--------------+-------------------+-------------------+
| date | downloadtime | clientcountrycode | clientcountryname |
+---------------------+--------------+-------------------+-------------------+
| 2013-07-10 10:44:29 | 2 | USA | United States |
| 2013-07-10 10:44:25 | 4 | USA | United States |
| 2013-07-10 10:44:21 | 7 | USA | United States |
| 2013-07-10 10:44:16 | 2 | USA | United States |
| 2013-07-10 10:44:10 | 3 | USA | United States |
+---------------------+--------------+-------------------+-------------------+
I need to prepare a csv file by querying this table. The csv file should be of the following format:
clientcountryname,clientcountrycode,2013-07-05,2013-07-06,2013-07-8...
United States,USA,22,23,24
So, basically, I need to get the average downloadtime for each country for each day.
I have a query which will give me avg(downloadtime) for a particular day:
SELECT clientcountryname, clientcountrycode, avg(downloadtime) FROM tb_npp WHERE date(date) = '2013-07-10' GROUP BY clientcountrycode;
+---------------------------------------+-------------------+-------------------+
| clientcountryname | clientcountrycode | avg(downloadtime) |
+---------------------------------------+-------------------+-------------------+
| Anonymous Proxy | A1 | 118.0833 |
| Satellite Provider | A2 | 978.5000 |
| Aruba | ABW | 31.8462 |
My question is: Is there a way in SQL to group the column names based on date which is present in my database?
If I understand your question correctly, you should just be able to group by the date as well. Since date holds a full timestamp, group on DATE(date) to get one row per day:
SELECT clientcountryname, clientcountrycode, DATE(date) AS day, AVG(downloadtime)
FROM tb_npp
GROUP BY clientcountryname, clientcountrycode, DATE(date);
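That query gives one row per country per day, not one column per day. Since the set of dates changes, it may be simplest to do the last step (turning days into CSV columns) in whatever script writes the file. Here is a rough sketch in Python, using the built-in sqlite3 as a stand-in for MySQL and a few invented rows; it groups on DATE(date) so that timestamps from the same day fall into one bucket.

```python
import csv
import io
import sqlite3

# In-memory SQLite stands in for MySQL; the rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tb_npp (date TEXT, downloadtime INTEGER, "
    "clientcountrycode TEXT, clientcountryname TEXT)"
)
conn.executemany(
    "INSERT INTO tb_npp VALUES (?, ?, ?, ?)",
    [
        ("2013-07-09 08:00:00", 4, "USA", "United States"),
        ("2013-07-10 10:44:29", 2, "USA", "United States"),
        ("2013-07-10 10:44:25", 4, "USA", "United States"),
        ("2013-07-10 09:00:00", 30, "ABW", "Aruba"),
    ],
)

# One row per (country, day) with the average download time.
rows = conn.execute(
    "SELECT clientcountryname, clientcountrycode, DATE(date) AS day, AVG(downloadtime) "
    "FROM tb_npp GROUP BY clientcountryname, clientcountrycode, DATE(date)"
).fetchall()

# Pivot in application code: one CSV line per country, one column per day.
days = sorted({day for _, _, day, _ in rows})
by_country = {}
for name, code, day, avg in rows:
    by_country.setdefault((name, code), {})[day] = avg

out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(["clientcountryname", "clientcountrycode", *days])
for (name, code), averages in sorted(by_country.items()):
    # Days with no downloads for a country are left as empty cells.
    writer.writerow([name, code, *(averages.get(day, "") for day in days)])

csv_text = out.getvalue()
print(csv_text)
```

Swapping io.StringIO for an open file handle would write the CSV to disk instead.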

How to Query a Table with Multiple Foreign Keys and Return Actual Values

New to MySQL, so please bear with me.
I'm working on a project that collects users' degrees. Users can save 3 degrees, where the type, subject matter, and school are variable. These relations are normalized for other query uses, so 5 tables are involved and are shown below (all have more columns than shown; I've just included the relevant info). The last one, 'user_degrees', is where the keys come together.
degrees
+----+-------------------+
| id | degree_type |
+----+-------------------+
| 01 | Bachelor's Degree |
| 02 | Master's Degree |
| 03 | Ph.D. |
| 04 | J.D. |
+----+-------------------+
acad_category
+------+-----------------------------------------+
| id | acad_cat_name |
+------+-----------------------------------------+
| 0015 | Accounting |
| 0026 | Business Law |
| 0027 | Finance |
| 0028 | Hotel & Restaurant Management |
| 0029 | Human Resources |
| 0030 | Information Systems and Technology |
+------+-----------------------------------------+
institutions
+--------+--------------------------------------------+
| id | inst_name |
+--------+--------------------------------------------+
| 000001 | A T Still University of Health Sciences |
| 000002 | Abilene Christian University |
| 000003 | Abraham Baldwin Agricultural College |
+--------+--------------------------------------------+
users
+----------+----------+
| id | username |
+----------+----------+
| 00000013 | Test1 |
| 00000018 | Test2 |
| 00000023 | Test3 |
+----------+----------+
user_degrees
+---------+-----------+---------+---------+
| user_id | degree_id | acad_id | inst_id |
+---------+-----------+---------+---------+
| 18 | 1 | 4 | 1 |
| 23 | 1 | 15 | 1 |
| 23 | 2 | 15 | 1 |
| 23 | 3 | 15 | 1 |
+---------+-----------+---------+---------+
How can I query 'user_degrees' to find all degrees by user x, but return the actual values of the foreign keys? Taking user Test3 as an example, I'm looking for output like so (truncated for layout's sake):
+-------------------+-------------------+-------------------+
| degree_type | acad_cat_name | inst_name |
+-------------------+-------------------+-------------------+
| Bachelor's Degree | Accounting | A T Still Uni.. |
| Master's Degree | Accounting | A T Still Uni.. |
| Ph.D. | Accounting | A T Still Uni.. |
+-------------------+-------------------+-------------------+
I'm guessing a mix of multiple joins, temp tables and subqueries are the answer but am having trouble grasping the order of things. Any insight is much appreciated, thanks for reading.
You need to join user_degrees to degrees (and the other tables referenced by user_degrees). This is the query that will give you your example output:
SELECT
ud.user_id, d.degree_type, ac.acad_cat_name, i.inst_name
FROM
user_degrees ud
INNER JOIN degrees d ON d.id = ud.degree_id
INNER JOIN acad_category ac ON ac.id = ud.acad_id
INNER JOIN institutions i ON i.id = ud.inst_id
WHERE
ud.user_id = 23
You may also want to read this article to understand different kinds of joins: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
The only way to understand these things at your stage of learning is to actually write the queries and then modify them until you get your desired output.
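If you want something to experiment with, here is a small self-contained sketch that loads only the rows shown in the question into an in-memory SQLite database (standing in for MySQL) and runs the same three-way join. It assumes the leading zeros in the ids are just display padding, so plain integers are used. Because acad_category id 4 is not among the rows shown, user 18 is a handy case for seeing how INNER JOIN and LEFT JOIN differ.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; only rows shown in the question are loaded.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE degrees (id INTEGER PRIMARY KEY, degree_type TEXT);
CREATE TABLE acad_category (id INTEGER PRIMARY KEY, acad_cat_name TEXT);
CREATE TABLE institutions (id INTEGER PRIMARY KEY, inst_name TEXT);
CREATE TABLE user_degrees (user_id INTEGER, degree_id INTEGER, acad_id INTEGER, inst_id INTEGER);
INSERT INTO degrees VALUES (1, 'Bachelor''s Degree'), (2, 'Master''s Degree'), (3, 'Ph.D.'), (4, 'J.D.');
INSERT INTO acad_category VALUES (15, 'Accounting'), (26, 'Business Law');
INSERT INTO institutions VALUES (1, 'A T Still University of Health Sciences');
INSERT INTO user_degrees VALUES (18, 1, 4, 1), (23, 1, 15, 1), (23, 2, 15, 1), (23, 3, 15, 1);
""")

# Same join as above; {join} is filled in with INNER or LEFT to compare the two.
query = """
SELECT d.degree_type, ac.acad_cat_name, i.inst_name
FROM user_degrees ud
{join} JOIN degrees d ON d.id = ud.degree_id
{join} JOIN acad_category ac ON ac.id = ud.acad_id
{join} JOIN institutions i ON i.id = ud.inst_id
WHERE ud.user_id = ?
ORDER BY d.id
"""

inner_23 = conn.execute(query.format(join="INNER"), (23,)).fetchall()
inner_18 = conn.execute(query.format(join="INNER"), (18,)).fetchall()
left_18 = conn.execute(query.format(join="LEFT"), (18,)).fetchall()

print(inner_23)  # three rows, one per degree saved by user 23
print(inner_18)  # no rows: acad_id 4 has no match in this sample acad_category
print(left_18)   # one row, with None where the category lookup found nothing
```

With an INNER JOIN, a user_degrees row disappears as soon as any one of its lookups has no match; a LEFT JOIN keeps the row and fills the missing value with NULL.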