SQL Count Distinct Using Multiple Unique Identifiers - mysql

My company ran a series of TV ads and we're measuring the impact by changes in our website traffic. I would like to determine the cost per session we saw generated, based on the cost of each ad.
The trouble is, the table this query references has duplicate data, so my current cost_per_session isn't calculated correctly.
What I have so far:
client_net_cleared = cost of ad
ad_time, media_outlet, & program = combined are a unique identifier for each ad
diff = assumed sessions generated by ad
SELECT DISTINCT tadm.timestamp AS ad_time
, tadm.media_outlet AS media_outlet
, tadm.program AS program
, tadm.client_net_cleared AS client_net_cleared
, SUM(tadm.before_ad_sum) AS before_ad_sessions
, SUM(tadm.after_ad_sum) AS after_ad_sessions
, (SUM(tadm.after_ad_sum) - SUM(tadm.before_ad_sum)) AS diff
, CASE WHEN tadm.client_net_cleared = 0 THEN null
WHEN (SUM(tadm.after_ad_sum) - SUM(tadm.before_ad_sum)) <1 THEN null
ELSE (tadm.client_net_cleared/(SUM(tadm.after_ad_sum) - SUM(tadm.before_ad_sum)))
END AS cost_per_session
FROM tableau.km_tv_ad_data_merged tadm
GROUP BY ad_time,media_outlet,program,client_net_cleared
Sample data:
ad_time | media_outlet | program | client_net_cleared | before_ad_sessions | after_ad_sessions | diff | cost_per_session
---------------------|---------------|----------------|--------------------|--------------------|--------------------|------|-----------------
2016-12-09 22:55:00 | DIY | | 970 | 55 | 72 | 17 | 57.05
2016-12-11 02:22:00 | E! | E! News | 388 | 25 | 31 | 6 | 64.66
2016-12-19 21:15:00 | Cooking | The Best Thing | 428 | 70 | 97 | 27 | 15.85
2016-12-22 14:01:00 | Oxygen | Next Top Model | 285 | 95 | 148 | 53 | 5.37
2016-12-09 22:55:00 | DIY | | 970 | 55 | 72 | 17 | 57.05
2016-12-04 16:13:00 | Headline News | United Shades | 1698 | 95 | 137 | 42 | 40.42
What I need:
Only count one instance of each ad when calculating cost_per_session.
EDIT: Fixed the query, had a half completed row where I was failing at doing this before asking the question. :)

Get rid of the DISTINCT in SELECT DISTINCT in the first line of your query. It makes no sense in a GROUP BY query.
If your rows are entirely duplicate, try deduplicating the table before you put it into the GROUP BY grinder by replacing
FROM tableau.km_tv_ad_data_merged tadm
with
FROM ( SELECT DISTINCT timestamp, media_outlet, program,
client_net_cleared,
before_ad_sum, after_ad_sum
FROM tableau.km_tv_ad_data_merged
) tadm
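To see what the deduplicating subquery changes, here is a minimal sketch that runs the same aggregation against the raw table and against the SELECT DISTINCT derived table. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the raw rows are hypothetical (the question only shows query output, not the underlying rows).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE km_tv_ad_data_merged (
        timestamp TEXT, media_outlet TEXT, program TEXT,
        client_net_cleared REAL, before_ad_sum INTEGER, after_ad_sum INTEGER
    )
""")
# Hypothetical raw rows: the DIY ad appears twice as an exact duplicate.
rows = [
    ("2016-12-09 22:55:00", "DIY", "", 970, 55, 72),
    ("2016-12-09 22:55:00", "DIY", "", 970, 55, 72),
    ("2016-12-11 02:22:00", "E!", "E! News", 388, 25, 31),
]
conn.executemany("INSERT INTO km_tv_ad_data_merged VALUES (?, ?, ?, ?, ?, ?)", rows)

# Same GROUP BY query, with the FROM source swapped in.
query = """
    SELECT tadm.timestamp AS ad_time, tadm.media_outlet, tadm.program,
           SUM(tadm.after_ad_sum) - SUM(tadm.before_ad_sum) AS diff
    FROM {source} tadm
    GROUP BY tadm.timestamp, tadm.media_outlet, tadm.program, tadm.client_net_cleared
    ORDER BY ad_time
"""
raw = "km_tv_ad_data_merged"
deduped = """(SELECT DISTINCT timestamp, media_outlet, program,
                     client_net_cleared, before_ad_sum, after_ad_sum
              FROM km_tv_ad_data_merged)"""

diff_raw = [r[3] for r in conn.execute(query.format(source=raw))]
diff_dedup = [r[3] for r in conn.execute(query.format(source=deduped))]
print(diff_raw)    # the duplicated DIY row is summed twice
print(diff_dedup)  # each ad contributes once
```

Note that this only helps when the duplicate rows are identical in every selected column; rows that differ in any of them survive the DISTINCT.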

Related

Get Number of A's in Result Table - MySQL

Here is the situation. At my school, each of the 17 classes prepares an Excel sheet with the marks for each subject in the term-end test. I combine them into an Access table, export all the data back to Excel, save it as a CSV file, and import it into a MySQL database using phpMyAdmin. I now have a result table as follows.
| ID | Name | Religion | Sinhala | science | english | maths | History | Categery 1 | Categery 2 | Categery 3 | Total | Average | Rank | |
|---- |------- |---------- |--------- |--------- |--------- |------- |--------- |------------ |------------ |------------ |------- |--------- |------ |--- |
| 1 | manoj | 45 | 65 | 78 | 98 | 67 | 67 | 63 | 76 | 64 | 654 | 62 | 12 | |
The sectional head needs the number of students who got >75 in all subjects,
and the number of students who got >75 in 8 subjects out of 9.
I need to retrieve the number of A's and B's (marks >=75) from this table.
For example: student names and their number of A's, plus totals such as
Total number of students with an A in all 9 subjects - 45
Total number of students with an A in 8 subjects (any 8 subjects) - 45
Total number of students with an A in 7 subjects (any 7 subjects) - 45
I tried the following SQL statement:
SELECT COUNT(SELECT COUNT()
FROM result
WHERE religion >=75
AND Math >=75)
FROM result
I read about a similar scenario on Stack Overflow (an Access 2010 question).
It got me part of the way, but I can't adapt it to my scenario.
Use GROUP BY studentName and SUM(grade = 'A') AS numberOfAs.
[Quick answer bc question is quickly formatted]
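The answer above assumes one row per student and subject. For the wide table in the question (one column per subject), a similar idea is to add up the comparisons directly, since a comparison evaluates to 1 or 0 in MySQL. A rough sketch, using Python's sqlite3 as a stand-in for MySQL, only three of the subject columns, and hypothetical rows for students 2 and 3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE result (
        id INTEGER, name TEXT, religion INTEGER, sinhala INTEGER, science INTEGER
    )
""")
conn.executemany("INSERT INTO result VALUES (?, ?, ?, ?, ?)", [
    (1, "manoj", 45, 65, 78),
    (2, "nimal", 80, 90, 75),  # hypothetical student
    (3, "kamal", 76, 70, 88),  # hypothetical student
])

# Each comparison yields 1 (true) or 0 (false), so their sum is the number of A's.
per_student = conn.execute("""
    SELECT name,
           (religion >= 75) + (sinhala >= 75) + (science >= 75) AS number_of_as
    FROM result
    ORDER BY id
""").fetchall()

# How many students have exactly N A's.
by_count = conn.execute("""
    SELECT number_of_as, COUNT(*) AS students
    FROM (SELECT (religion >= 75) + (sinhala >= 75) + (science >= 75) AS number_of_as
          FROM result) t
    GROUP BY number_of_as
    ORDER BY number_of_as DESC
""").fetchall()

print(per_student)
print(by_count)
```

With all nine subject columns in the sum, the row where number_of_as is 9 gives the count of students with an A in every subject, 8 gives "any 8 subjects", and so on.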

MySQL, Determine value associated with MAX() of another value using GROUP BY [duplicate]

I have a MySQL database that contains the table "message_route". This table tracks the path between hubs that a message from a device takes before it finds a modem and goes out to the internet.
"message_route" contains the following columns:
id, summary_id, device_id, hub_address, hop_count, event_time
Each row in the table represents a single "hop" between two hubs. The column "device_id" gives the id of the device the message originated from. The column "hub_address" gives the id of the hub the message hop was received by, and "hop_count" counts these hops incrementally. The full route of the message is bound together by the "summary_id" key. A snippet of the table to illustrate:
+-----+------------+-----------+-------------+-----------+---------------------+
| id | summary_id | device_id | hub_address | hop_count | event_time |
+-----+------------+-----------+-------------+-----------+---------------------+
| 180 | 158 | 1099 | 31527 | 1 | 2011-10-01 04:50:53 |
| 181 | 159 | 1676 | 51778 | 1 | 2011-10-01 00:12:04 |
| 182 | 159 | 1676 | 43567 | 2 | 2011-10-01 00:12:04 |
| 183 | 159 | 1676 | 33805 | 3 | 2011-10-01 00:12:04 |
| 184 | 160 | 2326 | 37575 | 1 | 2011-10-01 00:12:07 |
| 185 | 160 | 2326 | 48024 | 2 | 2011-10-01 00:12:07 |
| 186 | 160 | 2326 | 57652 | 3 | 2011-10-01 00:12:07 |
+-----+------------+-----------+-------------+-----------+---------------------+
There are three messages in total here. The message with summary_id = 158 touched only one hub before finding a modem, so the row with id = 180 is the entire record of that message. Summary_ids 159 and 160 each have 3 hops, each touching 3 different hubs. There is no upper limit on the number of hops a message can have.
I need to create a MySQL query that gives me a list of the unique "hub_address" values that constitute the last hop of a message. In other words, the hub_address associated with the maximum hop_count for each summary_id. With the database snippet above, the output should be "31527, 33805, 57652".
I have been unable to figure this out. In the meantime, I am using this code as a proxy, which only gives me the unique hub_address values for messages with a single hop, such as summary_id = 158.
SELECT DISTINCT(x.hub_address)
FROM (SELECT hub_address, COUNT(summary_id) AS freq
FROM message_route GROUP BY summary_id) AS x
WHERE x.freq = 1;
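I would approach this with a correlated subquery on hop_count, since the last hop is the row with the maximum hop_count for its summary_id (in the sample, event_time is identical for every hop of a message, so it cannot single out the last one):
select distinct mr.hub_address
from message_route mr
where mr.hop_count = (select max(mr2.hop_count)
                      from message_route mr2
                      where mr2.summary_id = mr.summary_id
                     );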
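As a quick check against the sample rows from the question, the sketch below runs the correlated subquery keyed on hop_count (the column the question asks about) and, for comparison, the same query keyed on event_time. It uses Python's sqlite3 as a stand-in for MySQL; the helper name last_hop_hubs is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE message_route (
        id INTEGER PRIMARY KEY, summary_id INTEGER, device_id INTEGER,
        hub_address INTEGER, hop_count INTEGER, event_time TEXT
    )
""")
conn.executemany("INSERT INTO message_route VALUES (?, ?, ?, ?, ?, ?)", [
    (180, 158, 1099, 31527, 1, "2011-10-01 04:50:53"),
    (181, 159, 1676, 51778, 1, "2011-10-01 00:12:04"),
    (182, 159, 1676, 43567, 2, "2011-10-01 00:12:04"),
    (183, 159, 1676, 33805, 3, "2011-10-01 00:12:04"),
    (184, 160, 2326, 37575, 1, "2011-10-01 00:12:07"),
    (185, 160, 2326, 48024, 2, "2011-10-01 00:12:07"),
    (186, 160, 2326, 57652, 3, "2011-10-01 00:12:07"),
])

def last_hop_hubs(column):
    """Hubs whose row holds the per-message maximum of `column` (hypothetical helper)."""
    sql = f"""
        SELECT DISTINCT mr.hub_address
        FROM message_route mr
        WHERE mr.{column} = (SELECT MAX(mr2.{column})
                             FROM message_route mr2
                             WHERE mr2.summary_id = mr.summary_id)
        ORDER BY mr.hub_address
    """
    return [row[0] for row in conn.execute(sql)]

by_hop_count = last_hop_hubs("hop_count")
by_event_time = last_hop_hubs("event_time")
print(by_hop_count)   # one hub per message: the last hop
print(by_event_time)  # every hub, because all hops of a message share one timestamp
```

On this data the event_time variant cannot tell the hops of a message apart, which is why hop_count is the safer key here.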

Best way to gain performance and do fast sql queries?

I use MySQL for my database and I do some processing on the database side to make things easier for my application.
My queries used to be very fast, but now that the database holds a lot of data they have become very slow.
My application mainly computes statistics and has a lot of related data to fetch.
Here is an example:
tbl_game
+-------------------------------------+
| id | winner | duration| endedAt |
|--------+--------+---------+---------|
| 1 | 1 | 1200 |timestamp|
| 2 | 0 | 1200 |timestamp|
| 3 | 1 | 1200 |timestamp|
| 4 | 1 | 1200 |timestamp|
+-------------------------------------+
winner is either 0 or 1 for the team who won the game
duration is the number of seconds a game took
tbl_game_player
+-------------------------------------------------+
| gameId | playerId | playerSlot | frags | deaths |
|--------+----------+------------+-------+--------|
| 1 | 100 | 1 | 24 | 50 |
| 1 | 150 | 2 | 32 | 52 |
| 1 | 101 | 3 | 26 | 62 |
| 1 | 109 | 4 | 48 | 13 |
| 1 | 123 | 5 | 24 | 52 |
| 1 | 135 | 6 | 30 | 30 |
| 1 | 166 | 7 | 28 | 48 |
| 1 | 178 | 8 | 52 | 96 |
| 1 | 190 | 9 | 12 | 75 |
| 1 | 106 | 10 | 68 | 25 |
+-------------------------------------------------+
The details are only for the first game with id 1
1 game has 10 player slots where slot 1-5 = team 0 and 6-10 = team 1
There are more details in my real table this is just to give an overview.
So I need to calculate the statistics of each player across all the games. I created a view to accomplish this, and it works fine when I have little data.
Here is an example:
+--------------------------------------------------------------------------+
| gameId | playerId | frags | deaths | actions | team | percent | isWinner |
|--------+----------+-------+--------+---------+------+---------+----------|
actions = frags + deaths
percent = (actions / sum(actions of players in the same team)) * 100
team is calculated using playerSlot in 1,2,3,4,5 or 6,7,8,9,10
isWinner is calculated by the team and winner
This is just one algorithm, and I have many others to perform. My database has over 1 million records and the queries are very slow.
Here is the query for the above:
SELECT
tgp.gameId,
tgp.playerId,
tgp.frags,
tgp.deaths,
tgp.frags + tgp.deaths AS actions,
IF(playerSlot in (1,2,3,4,5), 0, 1) AS team,
((SELECT actions) / tgpx.totalActions) * 100 AS percent,
IF((SELECT team) = tg.winner, 1, 0) AS isWinner
FROM tbl_game_player tgp
INNER JOIN tbl_game tg on tgp.gameId = tg.id
INNER JOIN (
SELECT
gameId,
SUM(frags) AS totalFrags,
SUM(deaths) AS totalDeaths,
SUM(frags) + SUM(deaths) as totalActions,
IF(playerSlot in (1,2,3,4,5), 0, 1) as team
FROM tbl_game_player
GROUP BY gameId, team
) tgpx on tgp.gameId = tgpx.gameId and team = tgpx.team
It's quite obvious that indexes don't help you here¹, because you want all data from the two tables. You even want the data from tbl_game_player twice, once aggregated, once not aggregated. So there are millions of records to read and join. Your query is fine, and I see no way to improve it really.
¹ Of course you should always have indexes on primary and foreign keys, so the DBMS can make use of them in joins. (E.g. there should be an index on tbl_game_player(gameId).)
So your options lie outside the query:
Hardware (obviously).
Add a computed column for the team to tbl_game_player, so at least you save its evaluation when querying.
Partitions. One partition per team, so the aggregates can be calculated separately.
Pre-computed data: Add a table tbl_game_team holding the sums; fill it with triggers. Thus you don't have to compute the aggregates in your query.
Data warehouse table: Make a table holding the complete result. Fill it with triggers or at intervals.
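The pre-computed data option could look roughly like the sketch below. It uses Python's sqlite3 as a stand-in for MySQL and fills the summary table with a one-off INSERT ... SELECT rather than triggers, to keep the example short. The column layout of tbl_game_team is an assumption; the answer only names the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_game (id INTEGER PRIMARY KEY, winner INTEGER, duration INTEGER);
    CREATE TABLE tbl_game_player (
        gameId INTEGER, playerId INTEGER, playerSlot INTEGER,
        frags INTEGER, deaths INTEGER
    );
    -- Assumed layout for the summary table: one row per game and team.
    CREATE TABLE tbl_game_team (
        gameId INTEGER, team INTEGER, totalActions INTEGER,
        PRIMARY KEY (gameId, team)
    );
""")
conn.execute("INSERT INTO tbl_game VALUES (1, 1, 1200)")
conn.executemany("INSERT INTO tbl_game_player VALUES (?, ?, ?, ?, ?)", [
    (1, 100, 1, 24, 50), (1, 150, 2, 32, 52), (1, 101, 3, 26, 62),
    (1, 109, 4, 48, 13), (1, 123, 5, 24, 52), (1, 135, 6, 30, 30),
    (1, 166, 7, 28, 48), (1, 178, 8, 52, 96), (1, 190, 9, 12, 75),
    (1, 106, 10, 68, 25),
])

# Fill the summary table once; in production this would be kept current
# by triggers or a scheduled job.
conn.execute("""
    INSERT INTO tbl_game_team (gameId, team, totalActions)
    SELECT gameId,
           CASE WHEN playerSlot <= 5 THEN 0 ELSE 1 END,
           SUM(frags + deaths)
    FROM tbl_game_player
    GROUP BY gameId, CASE WHEN playerSlot <= 5 THEN 0 ELSE 1 END
""")

# The per-player query now joins the small summary table instead of
# re-aggregating tbl_game_player on every run.
stats = conn.execute("""
    SELECT tgp.playerId,
           tgt.team,
           100.0 * (tgp.frags + tgp.deaths) / tgt.totalActions AS percent,
           CASE WHEN tgt.team = tg.winner THEN 1 ELSE 0 END AS isWinner
    FROM tbl_game_player tgp
    JOIN tbl_game tg ON tg.id = tgp.gameId
    JOIN tbl_game_team tgt
      ON tgt.gameId = tgp.gameId
     AND tgt.team = CASE WHEN tgp.playerSlot <= 5 THEN 0 ELSE 1 END
    ORDER BY tgp.playerSlot
""").fetchall()

team_totals = dict(conn.execute("SELECT team, totalActions FROM tbl_game_team"))
print(team_totals)
for row in stats:
    print(row)
```

The trade-off is the usual one for derived data: reads get cheaper, but every insert or update on tbl_game_player has to keep tbl_game_team in sync.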
Setting up indexes would speed up your queries. Queries can still take a while to run if they return a lot of results, but this is definitely a start.
For large databases, a MySQL index can help a lot with speed problems: an index is created on a table so that data can be found more quickly and efficiently. You can learn more about MySQL indexes here: http://www.w3schools.com/sql/sql_create_index.asp

MySQL Count within an IF

+-------------+--------------+----------+-------+
| ticketRefNo | nameOnTicket | boughtBy | event |
+-------------+--------------+----------+-------+
| 38 | J XXXXXXXXX | 2 | 13 |
| 39 | C YYYYYYY | 1 | 13 |
| 40 | M ZZZZZZZZZZ | 3 | 14 |
| 41 | C AAAAAAA | 3 | 15 |
| 42 | D BBBBBB | 3 | 16 |
| 43 | A CCCCC | 3 | 17 |
+-------------+--------------+----------+-------+
+-------------+------------------+--------------+---------------------+--------+
| ticketRefNo | cardNo | cardHolder | exp | issuer |
+-------------+------------------+--------------+---------------------+--------+
| 38 | 4444111133332222 | J McKenny | 2016-01-01 00:00:00 | BOS |
| 39 | 4434111133332222 | C Dempsey | 2016-04-01 00:00:00 | BOS |
| 40 | 4244111133332222 | M Gunn-Davis | 2018-02-01 00:00:00 | RBS |
+-------------+------------------+--------------+---------------------+--------+
+-------------+-------------+----------+
| ticketRefNo | boxOfficeID | paidWith |
+-------------+-------------+----------+
| 41 | 1 | card |
| 42 | 2 | cash |
| 43 | 3 | chequ |
+-------------+-------------+----------+
I have a database with the data shown above. It represents a ticket-buying system. I would like to be able to see a list of tickets bought with the name of the event and either the boxOfficeID or the issuer of the debit card.
I have tried running the following code, to no avail.
SELECT t.ticketRefNo AS 'Reference', t.event AS 'Event',
IF(COUNT(SELECT * FROM Online WHERE t.ticketRefNo=o.ticketRefNo;) >= 1,
o.issuer, InPerson.boxOfficeID) AS 'Card Issuer or Box Office'
FROM Ticket AS t, InPerson, Online AS o
WHERE t.ticketRefNo=o.ticketRefNo;
Cheers in advance!
Some notes: the semicolon inside the subquery isn't valid syntax; if you need to delimit the subquery, wrap it in parens. Escape column aliases like you'd escape any other identifier: use backticks, not single quotes. Single quotes are used around string literals.
Assuming that issuer in the Online table is NOT NULL, and assuming that ticketRefNo is unique in both the Online and InPerson tables, you could do something like this:
SELECT t.ticketRefNo AS `Reference`
, t.event AS `Event`
, IF(o.ticketRefNo IS NOT NULL,o.issuer,i.boxOfficeId)
AS `Card Issuer or Box Office`
FROM Ticket t
LEFT
JOIN InPerson i
ON i.ticketRefNo = t.ticketRefNo
LEFT
JOIN Online o
ON o.ticketRefNo = t.ticketRefNo
Use outer join operations to find matching rows in the InPerson and Online tables, and use a conditional test to see if you got a matching row from the Online table. A NULL will be returned if there wasn't a matching row found.
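Here is a small runnable sketch of the outer-join approach on the sample data from the question. It uses Python's sqlite3 as a stand-in for MySQL, so the MySQL IF(...) is written as the equivalent CASE expression, and the short aliases ref and source are chosen just for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Ticket (ticketRefNo INTEGER PRIMARY KEY, nameOnTicket TEXT,
                         boughtBy INTEGER, event INTEGER);
    CREATE TABLE Online (ticketRefNo INTEGER PRIMARY KEY, cardNo TEXT,
                         cardHolder TEXT, exp TEXT, issuer TEXT NOT NULL);
    CREATE TABLE InPerson (ticketRefNo INTEGER PRIMARY KEY, boxOfficeID INTEGER,
                           paidWith TEXT);
""")
conn.executemany("INSERT INTO Ticket VALUES (?, ?, ?, ?)", [
    (38, "J XXXXXXXXX", 2, 13), (39, "C YYYYYYY", 1, 13),
    (40, "M ZZZZZZZZZZ", 3, 14), (41, "C AAAAAAA", 3, 15),
    (42, "D BBBBBB", 3, 16), (43, "A CCCCC", 3, 17),
])
conn.executemany("INSERT INTO Online VALUES (?, ?, ?, ?, ?)", [
    (38, "4444111133332222", "J McKenny", "2016-01-01 00:00:00", "BOS"),
    (39, "4434111133332222", "C Dempsey", "2016-04-01 00:00:00", "BOS"),
    (40, "4244111133332222", "M Gunn-Davis", "2018-02-01 00:00:00", "RBS"),
])
conn.executemany("INSERT INTO InPerson VALUES (?, ?, ?)", [
    (41, 1, "card"), (42, 2, "cash"), (43, 3, "chequ"),
])

# A ticket matches a row in Online or in InPerson; the side without a
# match comes back as NULL from its LEFT JOIN.
tickets = conn.execute("""
    SELECT t.ticketRefNo AS ref,
           t.event,
           CASE WHEN o.ticketRefNo IS NOT NULL THEN o.issuer
                ELSE i.boxOfficeID END AS source
    FROM Ticket t
    LEFT JOIN InPerson i ON i.ticketRefNo = t.ticketRefNo
    LEFT JOIN Online o ON o.ticketRefNo = t.ticketRefNo
    ORDER BY t.ticketRefNo
""").fetchall()

for row in tickets:
    print(row)
```

The last column mixes card issuers (text) and box office IDs (integers), which is something to keep in mind when the result is consumed by application code.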
It's not a good idea to have one column JOINing to two different tables with some values in each of the two tables.
But here goes anyway:
( SELECT ... FROM Ticket t JOIN InPerson x USING(ticketRefNo) ... )
UNION ALL
( SELECT ... FROM Ticket t JOIN Online x USING(ticketRefNo) ... )
ORDER BY ...
The ALL assumes that InPerson and Online never have any overlapping ticketRefNos.
The ORDER BY at the end is in case you want to sort things, although I see no need for it in your attempted SELECT.
The two SELECTs must have the same number of columns.
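One possible way to fill in the elided SELECT lists is sketched below, again with Python's sqlite3 standing in for MySQL and trimmed-down versions of the question's tables. The column choice, the aliases ref and source, and the cast of boxOfficeID to text (so both branches return the same type) are assumptions, not part of the answer above. SQLite does not accept parentheses around the individual SELECTs, so they are omitted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Ticket (ticketRefNo INTEGER PRIMARY KEY, event INTEGER);
    CREATE TABLE Online (ticketRefNo INTEGER PRIMARY KEY, issuer TEXT);
    CREATE TABLE InPerson (ticketRefNo INTEGER PRIMARY KEY, boxOfficeID INTEGER);
    INSERT INTO Ticket VALUES (38, 13), (39, 13), (40, 14), (41, 15), (42, 16), (43, 17);
    INSERT INTO Online VALUES (38, 'BOS'), (39, 'BOS'), (40, 'RBS');
    INSERT INTO InPerson VALUES (41, 1), (42, 2), (43, 3);
""")

# Both branches return the same three columns; in MySQL the cast would be
# written CAST(x.boxOfficeID AS CHAR).
combined = conn.execute("""
    SELECT t.ticketRefNo AS ref, t.event AS event,
           CAST(x.boxOfficeID AS TEXT) AS source
    FROM Ticket t JOIN InPerson x USING (ticketRefNo)
    UNION ALL
    SELECT t.ticketRefNo AS ref, t.event AS event, x.issuer AS source
    FROM Ticket t JOIN Online x USING (ticketRefNo)
    ORDER BY ref
""").fetchall()

for row in combined:
    print(row)
```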

Convert Mysql Query to Rails ActiveRecord Query Without using find_by_sql

I have table named questions like follows
+----+---------------------------------------------------------+----------+
| id | title | category |
+----+---------------------------------------------------------+----------+
| 89 | Tinker or work with your hands? | 2 |
| 54 | Sketch, draw, paint? | 3 |
| 53 | Express yourself clearly? | 4 |
| 77 | Keep accurate records? | 6 |
| 32 | Efficient? | 6 |
| 52 | Make original crafts, dinners, school or work projects? | 3 |
| 70 | Be elected to office or make your opinions heard? | 5 |
| 78 | Take photographs? | 3 |
| 84 | Start your own political campaign? | 5 |
| 9 | Free spirit or a rebel? | 3 |
| 38 | Lead a group? | 5 |
| 71 | Work in groups? | 4 |
| 2 | Helpful? | 4 |
| 4 | Mechanical? | 6 |
| 14 | Responsible? | 6 |
| 66 | Pitch a tent, an idea? | 1 |
| 62 | Write useful business letters? | 5 |
| 28 | Creative? | 3 |
| 68 | Perform experiments? | 2 |
| 10 | Like to figure things out? | 2 |
+----+---------------------------------------------------------+----------+
I have a SQL query that gets one random record from each category. Can anyone convert this MySQL query to a Rails ActiveRecord query (without using Question.find_by_sql)? The MySQL query works absolutely fine, but I need an ActiveRecord query because later steps depend on it.
Here is mysql query
SELECT t.id, title as question, category
FROM
(
SELECT
(
SELECT id
FROM questions
WHERE category = t.category
ORDER BY RAND()
LIMIT 1
) id
FROM questions t
GROUP BY category
) q JOIN questions t
ON q.id = t.id
Thank You for your consideration!
When things get crazy, one has to reach for Arel:
It is intended to be a framework framework; that is, you can build
your own ORM with it, focusing on innovative object and collection
modeling as opposed to database compatibility and query generation.
So what we want to do is let Arel create the query for us. The approach used here is to left join the questions table with a randomized version of itself:
q_normal = Arel::Table.new("questions")
q_random = Arel::Table.new("questions").project(Arel.sql("*")).order("RAND()").as("q2")
Time to left join
query = q_normal.join(q_random, Arel::Nodes::OuterJoin).on(q_normal[:category].eq(q_random[:category])).group(q_normal[:category]).order(q_random[:category])
Now you can select whichever columns you want using project, e.g.:
query.project(q_normal[:id])
The only way I can think of to do this requires a good bit of application code. I don't think there's a way of accessing the RAND() functionality in MySQL (or equivalent in other DB technologies) using ActiveRecord. Here's what I came up with (using the category column from the table in the question):
# Number of questions in each category, keyed by category.
counts = Question.group(:category).count(:id)
# Pick a random row offset within each category.
offsets = {}
counts.each do |category, count|
  offsets[category] = rand(count)
end
# Fetch the question at that offset, one query per category.
random_questions = []
offsets.each do |category, offset|
  random_questions.push(Question.where(:category => category).offset(offset).first)
end