Rows to columns in MySQL without knowing the rows in advance - mysql

I have a MySQL table that looks like this:
+---------+-------------+--------------+--------+-------------+
| truckNo | excavatorId | times_loaded | litres | litres/time |
+---------+-------------+--------------+--------+-------------+
|       1 |         345 |          100 |     50 | 0.5         |
|       1 |         275 |           34 |     50 | 1.47        |
|       2 |         275 |          100 |     50 | 0.5         |
+---------+-------------+--------------+--------+-------------+
In this table, an excavator loads any truck one or more times. For example, excavator 345 loaded material onto truck 1 100 times, but truck 1 was also loaded by excavator 275 34 times. Now I want to group the results per truck, so that in the left column I can see the distinct truck and have each excavator as a new column, as in the following table:
+-------+-------------------+-------------+--------------+-------------+
| Truck |    Excavators     | Total_loads | Total litres | Litres/load |
|       +---------+---------+             |              |             |
|       |   345   |   275   |             |              |             |
+-------+---------+---------+-------------+--------------+-------------+
|   1   |   100   |    34   |         134 |         1.95 |        0.01 |
|   2   |         |    50   |          50 |           50 |           1 |
+-------+---------+---------+-------------+--------------+-------------+
The problem is that I never know in advance how many excavators there will be in the table, so I don't know how many columns to make.
Is there any way to do that? Is it possible in SQL?
EDIT: Each truck is loaded one or more times by any excavator. So in the result table, for example, truck no 1 was loaded 100 times by excavator 345 and 34 times by excavator 275. In Total_loads you see all loads that were put onto truck 1 from all/any excavators, which is 134. Same for litres. Litres/load in the result table is Total litres divided by Total_loads.

SQL queries always generate result tables with the columns stated. If you don't know the columns you want to show, you cannot write the query.
Your options:
1. Select the raw data and construct the result table from that data in your app or website.
2. Select the excavators first, then use your programming language to create a query specifically for those excavators (a MySQL-only variant of this is sketched at the end of this answer).
3. Make do with one column containing concatenated strings.
Here is #3:
select
truckno,
group_concat('EX(', excavatorid, ')=', times_loaded order by excavatorid) as loads,
sum(times_loaded) as total_loads,
sum(litres) as total_litres,
sum(litres) / sum(times_loaded) as litres_per_load
from truck_loads  -- assumed table name; the question does not give one
group by truckno
order by truckno;
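
A variant of #2 that stays entirely inside MySQL is to build the column list with GROUP_CONCAT and run the generated statement as a prepared statement. A sketch, assuming the table is named truck_loads (the question never names it):

SET @sql = NULL;

-- build one "SUM(CASE ...) AS EX_nnn" expression per distinct excavator
SELECT GROUP_CONCAT(DISTINCT
         CONCAT('SUM(CASE WHEN excavatorId = ', excavatorId,
                ' THEN times_loaded END) AS `EX_', excavatorId, '`'))
INTO @sql
FROM truck_loads;

SET @sql = CONCAT('SELECT truckNo, ', @sql,
                  ', SUM(times_loaded) AS total_loads',
                  ', SUM(litres) AS total_litres',
                  ', SUM(litres) / SUM(times_loaded) AS litres_per_load',
                  ' FROM truck_loads GROUP BY truckNo ORDER BY truckNo');

PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

Note that GROUP_CONCAT is limited by group_concat_max_len (1024 bytes by default), so raise that session variable if you have many excavators.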


Data output in a single row by order of importance

I'm trying to get data that is on multiple rows into a single row by order of importance.
I was working with multiple tables and was able to pull all the data I need into one table - so currently I'm working with one table where the data I need exists in multiple rows. For example, a person can have more than one role. However, the roles have an order of importance - I added an order of importance column to the file I'm working with.
The file I'm working with looks like this:
ID  | FIRST   | LAST  | ROLE   | ORDER OF IMPORTANCE
116 | Jamie   | Ansto | PARAL  | 5
116 | Jamie   | Ansto | FMREMP | 11
153 | Alan    | Rond  | PAR    | 3
153 | Alan    | Rond  | PARAL  | 5
155 | Maureen | Aron  | GP     | 4
155 | Maureen | Aron  | PARAL  | 5
38  | William | Dry   | STU    | 8
175 | Nathan  | Gong  | OTH    | 10
175 | Nathan  | Gong  | FMRSTU | 13
175 | Nathan  | Gon   | FR     | 14
308 | Bridget | Abad  | PAR    | 3
308 | Bridget | Abad  | EMP    | 7
370 | Matt    | Bodie | BD     | 1
370 | Matt    | Bodie | AL     | 2
What I need is a file that has all the codes associated with one person on the same row in the order of their importance.
I want to end up with something that looks like this:
ID  | FIRST   | LAST  | CODE1 | CODE2  | CODE3 | CODE4
116 | Jamie   | Ansto | PARAL | FMREMP |       |
153 | Alan    | Rond  | PAR   | PARAL  |       |
155 | Maureen | Aron  | GP    | PARAL  |       |
381 | William | Dry   | STU   |        |       |
175 | Nathan  | Gong  | OTH   | FMRSTU | FR    |
308 | Bridget | Abad  | PAR   | EMP    |       |
370 | Matt    | Bodie | BD    | AL     |       |
I tried using Group_Concat but it didn't give me the results in the order I wanted. Any help would be appreciated.
Thanks,
MG
You can do something like this:
SELECT *,GROUP_CONCAT(`ROLE` ORDER BY `ORDER_OF_IMPORTANCE` SEPARATOR ' ' )
FROM `table1` GROUP BY `ID`;
The SEPARATOR ' ' clause will give you a result like OTH FMRSTU FR. If you remove it and only do GROUP_CONCAT(ROLE ORDER BY ORDER_OF_IMPORTANCE), GROUP_CONCAT falls back to its default comma separator, so the result will look like OTH,FMRSTU,FR instead.
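
If you really need separate CODE1..CODE4 columns rather than one concatenated string, and you are on MySQL 8.0+, one sketch is to rank the roles with ROW_NUMBER() and pivot with MAX(CASE ...) (table and column names as in the query above; treat it as illustrative):

SELECT id,
       first,
       last,
       MAX(CASE WHEN rn = 1 THEN role END) AS code1,
       MAX(CASE WHEN rn = 2 THEN role END) AS code2,
       MAX(CASE WHEN rn = 3 THEN role END) AS code3,
       MAX(CASE WHEN rn = 4 THEN role END) AS code4
FROM (
    -- rank each person's roles by importance
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY id
                              ORDER BY `ORDER_OF_IMPORTANCE`) AS rn
    FROM `table1` t
) ranked
GROUP BY id, first, last;

This caps the number of code columns (here 4); if a person can have more roles than that, add more branches or fall back to GROUP_CONCAT.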

SQL Count Distinct Using Multiple Unique Identifiers

My company ran a series of TV ads and we're measuring the impact by changes in our website traffic. I would like to determine the cost per session we saw generated, based on the cost of each ad.
The trouble is, the table this references has duplicate data, so my current cost_per_session isn't calculated correctly.
What I have so far:
client_net_cleared = cost of ad
ad_time, media_outlet, & program = combined are a unique identifier for each ad
diff = assumed sessions generated by ad
SELECT DISTINCT tadm.timestamp AS ad_time
, tadm.media_outlet AS media_outlet
, tadm.program AS program
, tadm.client_net_cleared AS client_net_cleared
, SUM(tadm.before_ad_sum) AS before_ad_sessions
, SUM(tadm.after_ad_sum) AS after_ad_sessions
, (SUM(tadm.after_ad_sum) - SUM(tadm.before_ad_sum)) AS diff
, CASE WHEN tadm.client_net_cleared = 0 THEN null
WHEN (SUM(tadm.after_ad_sum) - SUM(tadm.before_ad_sum)) <1 THEN null
ELSE (tadm.client_net_cleared/(SUM(tadm.after_ad_sum) - SUM(tadm.before_ad_sum)))
END AS cost_per_session
FROM tableau.km_tv_ad_data_merged tadm
GROUP BY ad_time,media_outlet,program,client_net_cleared
Sample data:
ad_time | media_outlet | program | client_net_cleared | before_ad_sessions | after_ad_sessions | diff | cost_per_session
---------------------|---------------|----------------|--------------------|--------------------|--------------------|------|-----------------
2016-12-09 22:55:00 | DIY | | 970 | 55 | 72 | 17 | 57.05
2016-12-11 02:22:00 | E! | E! News | 388 | 25 | 31 | 6 | 64.66
2016-12-19 21:15:00 | Cooking | The Best Thing | 428 | 70 | 97 | 27 | 15.85
2016-12-22 14:01:00 | Oxygen | Next Top Model | 285 | 95 | 148 | 53 | 5.37
2016-12-09 22:55:00 | DIY | | 970 | 55 | 72 | 17 | 57.05
2016-12-04 16:13:00 | Headline News | United Shades | 1698 | 95 | 137 | 42 | 40.42
What I need:
Only count one instance of each ad when calculating cost_per_session.
EDIT: Fixed the query, had a half completed row where I was failing at doing this before asking the question. :)
Get rid of the DISTINCT in SELECT DISTINCT in the first line of your query. It makes no sense in a GROUP BY query.
If your rows are entirely duplicate, try deduplicating the table before you put it into the GROUP BY grinder by replacing
FROM tableau.km_tv_ad_data_merged tadm
with
FROM ( SELECT DISTINCT timestamp, media_outlet, program,
client_net_cleared,
before_ad_sum, after_ad_sum
FROM tableau.km_tv_ad_data_merged
) tadm
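
Putting the two changes together, the whole query would look roughly like this (same table and column names as in the question, so treat it as a sketch):

SELECT tadm.timestamp AS ad_time
     , tadm.media_outlet
     , tadm.program
     , tadm.client_net_cleared
     , SUM(tadm.before_ad_sum) AS before_ad_sessions
     , SUM(tadm.after_ad_sum)  AS after_ad_sessions
     , SUM(tadm.after_ad_sum) - SUM(tadm.before_ad_sum) AS diff
     , CASE WHEN tadm.client_net_cleared = 0 THEN NULL
            WHEN SUM(tadm.after_ad_sum) - SUM(tadm.before_ad_sum) < 1 THEN NULL
            ELSE tadm.client_net_cleared
                 / (SUM(tadm.after_ad_sum) - SUM(tadm.before_ad_sum))
       END AS cost_per_session
FROM ( SELECT DISTINCT timestamp, media_outlet, program,
              client_net_cleared, before_ad_sum, after_ad_sum
       FROM tableau.km_tv_ad_data_merged
     ) tadm
GROUP BY tadm.timestamp, tadm.media_outlet, tadm.program, tadm.client_net_cleared;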

What is the logic to represent sub-items from a box in a stock (warehouse) database?

For example: buying a 100 kg bag of rice gives one entry for the complete bag, and then 20 or 30 kilograms of that bag are given out. How can this be achieved in the stock database?
The structure can differ based on the behaviour of your application; there is no single right way. But here is an example to give you an idea:
inventory table:
id
date
merchandise_id
amount (negative for exits, positive for entries)
inventory_id (NULL for entries; for exits, the id of the corresponding entry)
Sample data:
id | date | merchandise_id | amount | inventory_id
-----------------------------------------------------------------
1 | 2016-06-01 | 32 | 100 | NULL
2 | 2016-06-03 | 32 | -20 | 1
3 | 2016-06-04 | 32 | -30 | 1
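
With this layout, the remaining quantity in each bag (entry) can be computed by joining the exits back to their entry row. A minimal sketch, assuming the table is named inventory:

-- remaining quantity per entry; exits are stored as negative amounts
SELECT e.id,
       e.merchandise_id,
       e.amount AS entry_amount,
       e.amount + COALESCE(SUM(x.amount), 0) AS remaining
FROM inventory e
LEFT JOIN inventory x
       ON x.inventory_id = e.id          -- exits taken out of this entry
WHERE e.inventory_id IS NULL             -- entries only
GROUP BY e.id, e.merchandise_id, e.amount;

For the sample data above, entry 1 starts at 100 and the two exits (-20 and -30) leave a remaining quantity of 50.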

Best way to gain performance and do fast sql queries?

I use MySQL for my database and I do some processing on the database side to make things easier for my application.
My queries used to be very fast, but recently the database has grown a lot and the queries have become very, very slow.
My application mainly produces statistics and has many related tables to fetch data from.
Here is an example:
tbl_game
+-------------------------------------+
| id | winner | duration| endedAt |
|--------+--------+---------+---------|
| 1 | 1 | 1200 |timestamp|
| 2 | 0 | 1200 |timestamp|
| 3 | 1 | 1200 |timestamp|
| 4 | 1 | 1200 |timestamp|
+-------------------------------------+
winner is either 0 or 1 for the team who won the game
duration is the number of seconds a game took
tbl_game_player
+-------------------------------------------------+
| gameId | playerId | playerSlot | frags | deaths |
|--------+----------+------------+-------+--------|
| 1 | 100 | 1 | 24 | 50 |
| 1 | 150 | 2 | 32 | 52 |
| 1 | 101 | 3 | 26 | 62 |
| 1 | 109 | 4 | 48 | 13 |
| 1 | 123 | 5 | 24 | 52 |
| 1 | 135 | 6 | 30 | 30 |
| 1 | 166 | 7 | 28 | 48 |
| 1 | 178 | 8 | 52 | 96 |
| 1 | 190 | 9 | 12 | 75 |
| 1 | 106 | 10 | 68 | 25 |
+-------------------------------------------------+
The details shown are only for the first game, with id 1.
A game has 10 player slots, where slots 1-5 = team 0 and slots 6-10 = team 1.
There are more details in my real table; this is just to give an overview.
So I need to calculate the statistics of each player across all the games. I created a view to accomplish this and it works fine when I have little data.
Here is an example:
+--------------------------------------------------------------------------+
| gameId | playerId | frags | deaths | actions | team | percent | isWinner |
|--------+----------+-------+--------+---------+------+---------+----------|
actions = frags + deaths
percent = (actions / sum(actions of players in the same team)) * 100
team is calculated using playerSlot in 1,2,3,4,5 or 6,7,8,9,10
isWinner is calculated by the team and winner
This is just one algorithm and I have many others to perform. My database has 1 million+ records and the queries are very slow.
Here is the query for the above:
SELECT
    tgp.gameId,
    tgp.playerId,
    tgp.frags,
    tgp.deaths,
    tgp.frags + tgp.deaths AS actions,
    IF(playerSlot in (1,2,3,4,5), 0, 1) AS team,
    ((SELECT actions) / tgpx.totalActions) * 100 AS percent,
    IF((SELECT team) = tg.winner, 1, 0) AS isWinner
FROM tbl_game_player tgp
INNER JOIN tbl_game tg on tgp.gameId = tg.id
INNER JOIN (
    SELECT
        gameId,
        SUM(frags) AS totalFrags,
        SUM(deaths) AS totalDeaths,
        SUM(frags) + SUM(deaths) as totalActions,
        IF(playerSlot in (1,2,3,4,5), 0, 1) as team
    FROM tbl_game_player
    GROUP BY gameId, team
) tgpx on tgp.gameId = tgpx.gameId and team = tgpx.team
It's quite obvious that indexes don't help you here¹, because you want all data from the two tables. You even want the data from tbl_game_player twice, once aggregated, once not aggregated. So there are millions of records to read and join. Your query is fine, and I see no way to improve it really.
¹ Of course you should always have indexes on primary and foreign keys, so the DBMS can make use of them in joins. (E.g. there should be an index on tbl_game_player(gameId).)
So your options lie outside the query:
Hardware (obviously).
Add a computed column for the team to tbl_game_player, so at least you save its evaluation when querying (see the sketch after this list).
Partitions. One partition per team, so the aggregates can be calculated separately.
Pre-computed data: Add a table tbl_game_team holding the sums; fill it with triggers. Thus you don't have to compute the aggregates in your query.
Data warehouse table: Make a table holding the complete result. Fill it with triggers or at intervals.
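
For the computed-column option, MySQL 5.7+ can do this with a stored generated column, which can then be indexed. A sketch (column and index names are illustrative):

-- persist the team so it no longer has to be evaluated per query
ALTER TABLE tbl_game_player
  ADD COLUMN team TINYINT
      AS (IF(playerSlot IN (1,2,3,4,5), 0, 1)) STORED;

-- let the join and GROUP BY use it
CREATE INDEX idx_gameplayer_game_team ON tbl_game_player (gameId, team);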
Setting up indexes would speed up your queries. Queries can take a while to run if there are a lot of results; this is definitely a start, though.
For large databases, a MySQL INDEX can be very helpful with speed problems. An index can be created on a table to find data more quickly and efficiently, so you should create indexes. You can learn more about MySQL indexes here: http://www.w3schools.com/sql/sql_create_index.asp
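
In the schema from the question, that boils down to something like the following (index name is illustrative; tbl_game.id is assumed to already be the primary key):

-- cover the join and the slot-based team calculation on the child table
CREATE INDEX idx_game_player_game ON tbl_game_player (gameId, playerSlot);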

Duplication/updating of data in MySQL

I want to get the sum of the data for every 5 minutes.
I have 15 motes.
Suppose that in the first 5 minutes only some of the motes are queried, and in the next 5 minutes some other motes are queried.
In the second 5-minute window I also need the data of the motes which were not queried in those 5 minutes.
I.e., in the first 5 minutes mote ids 1, 2, 3, 4, 9, 12, 14 are queried and in the second 5 minutes mote ids 1, 5, 6, 7, 9, 13, 14 are queried.
In the second 5 minutes, I need the data to be updated for the ones which were not queried as well. Is it possible to get the data from the previous 5 minutes?
moteid2 |  28 | 2012-09-25 17:45:43
moteid4 |  65 | 2012-09-25 17:45:49
moteid3 |  66 | 2012-09-25 17:45:51
moteid6 |  25 | 2012-09-25 17:45:56
moteid5 |  29 | 2012-09-25 17:45:58
moteid7 |  30 | 2012-09-25 17:46:05
moteid4 |  95 | 2012-09-25 17:50:29
moteid6 |  56 | 2012-09-25 17:50:35
moteid5 |  58 | 2012-09-25 17:50:36
moteid4 | 126 | 2012-09-25 17:55:08
In the first 5 minutes moteid2 and moteid3 are queried, but in the next 5 minutes they are not. Even if they are not queried, I want the previously queried value to be kept for them now.
I'm assuming the table name is motes. In this case, the following query displays all unique mote ids for the records which are present in the table but were not queried in the last 5 minutes:
select distinct m.moteid
from motes m
where not exists (
    select *
    from motes m1
    where
        m1.moteid = m.moteid and
        m1.date > SUBTIME(NOW(), '0:05:00')  -- NOW(), not CURTIME(): date is a datetime column
)
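
To actually carry a mote's previous reading forward into an interval where it was not queried, one sketch (still assuming the table motes with columns moteid, value and date; the boundary timestamp is just an example) is to pick each mote's most recent row at or before the end of the interval:

SELECT m.moteid, m.value, m.date
FROM motes m
JOIN (
    -- last time each mote reported anything up to the interval boundary
    SELECT moteid, MAX(date) AS last_seen
    FROM motes
    WHERE date <= '2012-09-25 17:50:00'
    GROUP BY moteid
) latest
  ON latest.moteid = m.moteid
 AND latest.last_seen = m.date;

Run once per 5-minute boundary (or with the timestamp parameterised), this returns the latest known value for every mote, whether or not it was queried in that window.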