Select Count from a table with 20 000 000 rows

I have an online rugby manager game. Every registered user has one team, and each team has 25 players at the beginning. There are friendly, league, and cup matches.
On each player's page I want to show the number of:
official games played,
tries,
conversions,
penalties,
drop goals
and for each of these categories, the totals:
in this season;
in his career;
for the national team;
for the U-20 national team.
I have two options:
$query = "SELECT id FROM tbl_matches WHERE type='0' AND standing='1'";
$query = mysql_query($query, $is_connected) or die('Can\'t connect to database.');
while ($match = mysql_fetch_array($query)) {
    $query2 = "SELECT COUNT(*) as 'number' FROM tbl_comm WHERE matchid='$match[id]' AND player='$player' and result='5'";
    $query2 = mysql_query($query2, $is_connected) or die('Can\'t connect to database.');
    $try = mysql_fetch_array($query2);
}
This script goes through every official match played by the selected player, gets the report for each match (about 20 commentary lines per match), and checks every line to see whether that player scored a try.
The problem is that in a few seasons there could be about 20 000 000 rows in the commentary table. Will my script be slow enough for users to notice?
The second option is to create a table of player stats, which would have about 21 columns.
What do you recommend that I do?

Why do all of those separate queries when you can just group them all together and get your count by id:
select tbl_matches.id,count(*)
from tbl_matches join tbl_comm on tbl_matches.id = tbl_comm.matchid
where tbl_comm.player = '$player'
and tbl_comm.result = '5'
and tbl_matches.type='0'
and tbl_matches.standing='1'
group by tbl_matches.id;
If you need additional columns, just add them to both the select columns and the group by column list.
Also: you should be extremely wary about substituting $player directly into your query. If that string isn't properly escaped, that could be the source of a SQL-injection attack.
EDIT: fixed query per Jonathan's comment
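To illustrate the warning about substituting $player directly into the SQL, here is a minimal sketch of the same join run with a bound parameter, returning one total instead of one row per match. It is written in Python with the built-in sqlite3 module only so that it is self-contained. The sample rows, the index idx_comm_player_result and the helper count_tries are assumptions for illustration, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_matches (id INTEGER PRIMARY KEY, type TEXT, standing TEXT);
    CREATE TABLE tbl_comm (matchid INTEGER, player INTEGER, result TEXT);
    -- Assumed index, so a lookup by player does not scan the whole commentary table
    CREATE INDEX idx_comm_player_result ON tbl_comm (player, result, matchid);

    -- Invented sample data: matches 1 and 2 are official, match 3 is not
    INSERT INTO tbl_matches VALUES (1, '0', '1'), (2, '0', '1'), (3, '1', '1');
    INSERT INTO tbl_comm VALUES
        (1, 7, '5'), (1, 7, '5'), (1, 8, '5'),
        (2, 7, '5'), (2, 7, '3'),
        (3, 7, '5');
""")


def count_tries(conn, player_id):
    """Total tries (result '5') for one player in official matches."""
    row = conn.execute(
        """
        SELECT COUNT(*)
        FROM tbl_matches
        JOIN tbl_comm ON tbl_matches.id = tbl_comm.matchid
        WHERE tbl_comm.player = ?          -- bound parameter, never string-built
          AND tbl_comm.result = '5'
          AND tbl_matches.type = '0'
          AND tbl_matches.standing = '1'
        """,
        (player_id,),
    ).fetchone()
    return row[0]


print(count_tries(conn, 7))
```

In PHP the same idea would be a prepared statement with a placeholder in place of '$player'.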

Related

Relational Database Logic

I'm fairly new to php / mysql programming and I'm having a hard time figuring out the logic for a relational database that I'm trying to build. Here's the problem:
I have different leaders who will be in charge of a store anytime between 9am and 9pm.
A customer who has visited the store can rate their experience on a scale of 1 to 5.
I'm building a site that will allow me to store the shifts that a leader worked as seen below.
When I hit submit, the site would take the data leaderName:"George", shiftTimeArray: 11am, 1pm, 6pm (from the example in the picture) and the shiftDate and send them to an SQL database.
Later, I want to be able to get the average score for a person by sending a query to mysql, retrieving all of the scores that that leader received and averaging them together. I know the code to build the forms and to perform the search. However, I'm having a hard time coming up with the logic for the tables that will relate the data. Currently, I have a mysql table called responses that contains the following fields,
leader_id
shift_date // contains the date that the leader worked
shift_time // contains the time that the leader worked
visit_date // contains the date that the survey/score was given
visit_time // contains the time that the survey/score was given
score // contains the actual score of the survey (1-5)
I enter the shifts that the leader works at the beginning of the week and then enter the survey scores in as they come in during the week.
So here's the question: what MySQL tables and fields should I create to relate this data so that I can query a leader's name and get the average score from all of their surveys?

You want tables like:
Leader (leader_id, name, etc)
Shift (leader_id, shift_date, shift_time)
SurveyResult (visit_date, visit_time, score)
Note: I have omitted the surrogate primary keys for Shift and SurveyResult, which I would probably include.
To query, join shifts to surveys, group by leader and take the average, then join that back to Leader for the name.
The query might be something like this (I haven't actually built it in MySQL to verify the syntax):
SELECT name
     , AverageScore
FROM Leader a
INNER JOIN (
    SELECT leader_id
         , AVG(score) AS AverageScore
    FROM Shift
    INNER JOIN SurveyResult
        ON shift_date = visit_date
       AND shift_time = visit_time -- what this really needs to be depends on how you are recording time
    GROUP BY leader_id
) b ON a.leader_id = b.leader_id

I would use the following structure:
leaders
    id
    name
leaders_timetable (can be multiple per leader)
    id
    leader_id
    shift_datetime (I assume it stores the date and hour; minutes and seconds are always 0)
survey_scores
    id
    visit_datetime
    score
SELECT l.id, l.name, AVG(s.score)
FROM leaders l
INNER JOIN leaders_timetable lt ON lt.leader_id = l.id
INNER JOIN survey_scores s ON lt.shift_datetime = DATE_FORMAT(s.visit_datetime, '%Y-%m-%d %H:00:00')
GROUP BY l.id, l.name
DATE_FORMAT here truncates the minutes and seconds from visit_datetime so that it can be matched against shift_datetime. This is a MySQL function, so if you use a different database you'll need a different function.
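As a rough check of the hour-matching join above, here is a small sketch that runs it against a few rows. It uses Python's built-in sqlite3 module so it is self-contained, which means SQLite's strftime stands in for MySQL's DATE_FORMAT. The leader "Anna", the dates and the scores are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE leaders (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE leaders_timetable (
        id INTEGER PRIMARY KEY, leader_id INTEGER, shift_datetime TEXT);
    CREATE TABLE survey_scores (
        id INTEGER PRIMARY KEY, visit_datetime TEXT, score INTEGER);

    -- Invented sample data
    INSERT INTO leaders VALUES (1, 'George'), (2, 'Anna');
    INSERT INTO leaders_timetable (leader_id, shift_datetime) VALUES
        (1, '2014-05-05 11:00:00'), (1, '2014-05-05 13:00:00'),
        (2, '2014-05-05 12:00:00');
    INSERT INTO survey_scores (visit_datetime, score) VALUES
        ('2014-05-05 11:20:00', 4), ('2014-05-05 13:45:10', 5),
        ('2014-05-05 12:05:00', 2), ('2014-05-05 15:00:00', 1);
""")

# strftime truncates each visit to the top of its hour (SQLite's counterpart
# of the DATE_FORMAT call), so it can be compared with shift_datetime.
rows = conn.execute("""
    SELECT l.name, AVG(s.score)
    FROM leaders l
    JOIN leaders_timetable lt ON lt.leader_id = l.id
    JOIN survey_scores s
      ON lt.shift_datetime = strftime('%Y-%m-%d %H:00:00', s.visit_datetime)
    GROUP BY l.id, l.name
    ORDER BY l.name
""").fetchall()

averages = dict(rows)
print(averages)
```

The 15:00 survey matches no shift, so the inner join leaves it out of every average.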

Say you have a 'leader' who has 5 survey rows with scores 1, 2, 3, 4 and 5.
If you select all surveys for this leader, sum the scores, and divide by 5 (the number of surveys this leader has), you get the average, in this case 3.
(1 + 2 + 3 + 4 + 5) / 5 = 3
You wouldn't need to create any more tables or fields; you have what you need.

Get entry with max value in MySQL

I've got a MySQL database with lots of entries of high scores for a game. I would like to get the "personal best" entry with the max value of score.
I found a solution that I thought worked, until I got more names in my database; then it returned completely different results.
My code so far:
SELECT name, score, date, version, mode, custom
FROM highscore
WHERE score =
(SELECT MAX(score)
FROM highscore
WHERE name = 'jonte' && gamename = 'game1')
For a lot of values, this actually returns the correct value as such:
JONTE 240 2014-04-28 02:52:33 1 0 2053
It worked fine with a few hundred entries, some with different names. But when I added new entries and swapped name to 'gabbes', for the new names I instead get a list of multiple entries. I don't see the logic here as the entries in the database seem quite identical with some differences in data.
JONTE 176 2014-04-28 11:03:46 1 0 63
GABBES 176 2014-04-28 11:09:12 1 0 3087
The above has two entries, but sometimes it may also return 10-20 entries in a row.
Any help?

If you want the high score for each person (i.e. personal best) you can do this...
SELECT name, max(score)
FROM highscore
WHERE gamename = 'game1'
GROUP BY name
Alternatively, you can do this...
SELECT name, score, date, version, mode, custom
FROM highscore h1
WHERE gamename = 'game1'
  AND score =
    (SELECT MAX(score)
     FROM highscore h2
     WHERE h2.name = h1.name AND h2.gamename = 'game1')
NOTE: In your SQL, the subquery is missing the name = h1.name predicate.
Note, however, that this second option will return multiple rows for the same person if they recorded the same high score more than once.

The multiple entries are returned because multiple entries have the same high score. You can add LIMIT 1 to get only a single entry. You can choose which entry to return with the ORDER BY clause.
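A minimal sketch of the ORDER BY plus LIMIT 1 approach, assuming that the earliest entry should win when a player has reached the same top score more than once. It runs in Python's built-in sqlite3 module so it is self-contained. The helper personal_best, the trimmed column list and the extra 'gabbes' rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE highscore (name TEXT, gamename TEXT, score INTEGER, date TEXT);
    -- The last two rows are invented: a repeated top score and another game
    INSERT INTO highscore VALUES
        ('jonte',  'game1', 240, '2014-04-28 02:52:33'),
        ('jonte',  'game1', 176, '2014-04-28 11:03:46'),
        ('gabbes', 'game1', 176, '2014-04-28 11:09:12'),
        ('gabbes', 'game1', 176, '2014-04-29 09:00:00'),
        ('gabbes', 'game2', 900, '2014-04-30 10:00:00');
""")


def personal_best(conn, name, gamename):
    """Return exactly one row: the highest score, earliest date on ties."""
    return conn.execute(
        """
        SELECT name, score, date
        FROM highscore
        WHERE name = ? AND gamename = ?
        ORDER BY score DESC, date ASC   -- the tie-break decides which entry is kept
        LIMIT 1
        """,
        (name, gamename),
    ).fetchone()


print(personal_best(conn, "gabbes", "game1"))
```

Filtering on both name and gamename in the same WHERE clause is what keeps another player's identical score from appearing in the result.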

Calculate a variable using 2 Mysql tables and make a select based on that variable

I own an online game in which you become the coach of a rugby team, and I recently started to optimize my database. The website uses the CodeIgniter framework.
I have the following tables (the tables have more fields but I posted only those which are important now):
LEAGUES: id
STANDINGS: league_id, team_id, points
TEAMS: id, active
Previously, the LEAGUES table had a field named teams, which held the number of active teams in that league (teams whose users logged in recently).
So I used the following select to get a random league with between 0 and 4 active teams (leagues with fewer teams first).
SELECT id FROM LEAGUES WHERE teams>0 AND teams<4 ORDER BY teams ASC, RAND( ) LIMIT 1
Is there any way I can do the same thing now without the teams field?
Would that be efficient, or is it better to keep the teams field in the database?
LATER EDIT
This is what I have so far:
function test()
{
    $this->db->select('league_id, team_id');
    $this->db->join('teams', 'teams.id = standings.team_id');
    $this->db->where('active', 0);
    $query = $this->db->get('standings');
    return $query->result_array();
}
The function returns all inactive teams along with their league_id.
Now how do I count the number of inactive teams in each league, and how do I sort the leagues by that number?

Try this:
select league_id
from standings s
join teams t on t.id = s.team_id and t.active
group by 1
having count(*) < 5
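Here is a sketch of how the grouped query could be extended to reproduce the original select, that is, fewer than 4 active teams, smallest leagues first, ties broken randomly. It runs in Python's built-in sqlite3 module so it is self-contained, so RANDOM() stands in for MySQL's RAND(). The sample leagues and teams are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE leagues (id INTEGER PRIMARY KEY);
    CREATE TABLE teams (id INTEGER PRIMARY KEY, active INTEGER);
    CREATE TABLE standings (league_id INTEGER, team_id INTEGER, points INTEGER);

    -- Invented data: league 1 has 5 active teams, league 2 has 1 (plus one
    -- inactive), league 3 has 2, league 4 has only an inactive team.
    INSERT INTO leagues VALUES (1), (2), (3), (4);
    INSERT INTO teams VALUES (1,1),(2,1),(3,1),(4,1),(5,1),(6,1),(7,0),(8,1),(9,1),(10,0);
    INSERT INTO standings VALUES
        (1,1,0),(1,2,0),(1,3,0),(1,4,0),(1,5,0),
        (2,6,0),(2,7,0),
        (3,8,0),(3,9,0),
        (4,10,0);
""")

# Active-team count per league, computed on the fly instead of stored.
counts = conn.execute("""
    SELECT s.league_id, COUNT(*) AS active_teams
    FROM standings s
    JOIN teams t ON t.id = s.team_id AND t.active = 1
    GROUP BY s.league_id
    ORDER BY s.league_id
""").fetchall()

# Same selection rule as the old teams column: fewer than 4 active teams,
# smallest leagues first, random choice among equally small ones.
picked = conn.execute("""
    SELECT s.league_id, COUNT(*) AS active_teams
    FROM standings s
    JOIN teams t ON t.id = s.team_id AND t.active = 1
    GROUP BY s.league_id
    HAVING COUNT(*) < 4
    ORDER BY active_teams ASC, RANDOM()
    LIMIT 1
""").fetchone()

print(counts, picked)
```

Because of the inner join, a league with no active teams produces no group at all, which matches the old teams>0 condition.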

Very slow execution of MySql update

I have this SQL:
UPDATE products pr
SET pr.product_model_id = (SELECT id FROM product_models pm WHERE pm.category_id = 1 ORDER BY rand() LIMIT 1)
limit 200;
It took the MySQL server more than 15 seconds to update these 200 records, and there are 220,000 records in the table.
Why is that?
edit:
I have these tables which are empty, but I need to fill them with random information for testing.
Realistic estimates show that I will have:
80 categories
40,000 models
And, around 500,000 products
So, I've manually created:
ALL the categories.
200 models (and used sql to duplicate them to 20k).
200 products (and duplicated them to 250k)
I need them all attached.
DB tables are:
categories {id, ...}
product_models {id, category_id, ...}
products {id, product_model_id, category_id}

The question seems a little odd, but here is a quick thought about the problem.
Ordering by the RAND function doesn't perform well on large data sets.
MySQL developers work around this in different ways; see these posts:
How can i optimize MySQL's ORDER BY RAND() function?
http://www.titov.net/2005/09/21/do-not-use-order-by-rand-or-how-to-get-random-rows-from-table/
One quick way is the following (in PHP):
// get the total number of matching rows
$result = mysql_query("SELECT COUNT(*) AS count
                       FROM product_models pm
                       WHERE pm.category_id = 1");
$row = mysql_fetch_array($result);
$total = $row['count'];
// pick a random offset; LIMIT offsets are zero-based, so the valid range is 0 to $total - 1
$randomvalue = rand(0, $total - 1);
// assign the model at that offset to 200 products
$result = mysql_query("UPDATE products pr
                       SET pr.product_model_id =
                           (SELECT id FROM product_models pm
                            WHERE pm.category_id = 1
                            LIMIT $randomvalue, 1)
                       LIMIT 200");
Hope this will help.

The 'problem' is the ORDER BY rand(): MySQL has to assign a random value to every matching row and sort them all just to pick one.
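Since the goal is only to fill test tables, another way to avoid ORDER BY rand() is to pick the random model in application code. Below is a minimal sketch of that idea, assuming each product should get a model from its own category. It uses Python's built-in sqlite3 module so it is self-contained. The helper assign_random_models and the generated rows are hypothetical.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_models (id INTEGER PRIMARY KEY, category_id INTEGER);
    CREATE TABLE products (
        id INTEGER PRIMARY KEY, product_model_id INTEGER, category_id INTEGER);
""")
# Invented test data: models 1-50 in category 1, 51-100 in category 2,
# and 200 products in category 1 with no model yet.
conn.executemany(
    "INSERT INTO product_models (id, category_id) VALUES (?, ?)",
    [(i, 1 if i <= 50 else 2) for i in range(1, 101)],
)
conn.executemany(
    "INSERT INTO products (id, category_id) VALUES (?, 1)",
    [(i,) for i in range(1, 201)],
)


def assign_random_models(conn, category_id, rng=random):
    """Give every product in a category a random model from that category."""
    model_ids = [r[0] for r in conn.execute(
        "SELECT id FROM product_models WHERE category_id = ?", (category_id,))]
    if not model_ids:
        return 0  # nothing to assign from
    product_ids = [r[0] for r in conn.execute(
        "SELECT id FROM products WHERE category_id = ?", (category_id,))]
    # The random choice happens here, once per product, with no sorting in SQL.
    conn.executemany(
        "UPDATE products SET product_model_id = ? WHERE id = ?",
        [(rng.choice(model_ids), pid) for pid in product_ids],
    )
    conn.commit()
    return len(product_ids)


updated = assign_random_models(conn, 1)
print(updated)
```

Unlike a single UPDATE with a fixed offset, each product here gets its own independently chosen model.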

MySQL - Optimize a Query and Find a rank based on column Sum

I have a high score database for a game that tracks every play in a variety of worlds. What I want to do is find out some statistics on the plays, and then find where each world "ranks" according to each other world (sorted by number of times played).
So far I've got all my statistics working fine, however I've run into a problem finding the ranking of each world.
I'm also pretty sure doing this in three separate queries is probably a very slow way to go about this and could probably be improved.
I have a timestamp column (not used here) and the "world" column indexed in the DB schema. Here's a selection of my source:
function getStast($worldName) {
    // ## First find the number of wins and some other data:
    $query = "SELECT COUNT(*) AS total,
                     AVG(score) AS avgScore,
                     SUM(score) AS totalScore
              FROM highscores
              WHERE world = '$worldName'
              AND victory = 1";
    $win = $row['total'];
    // ## Then find the number of losses:
    $query = "SELECT COUNT(*) AS total
              FROM highscores
              WHERE world = '$worldName'
              AND victory = 0";
    $loss = $row['total'];
    $total = $win + $loss;
    // ## Then find the rank (this is the broken bit):
    $query = "SELECT world, count(*) AS total
              FROM highscores
              WHERE total > $total
              GROUP BY world
              ORDER BY total DESC";
    $rank = $row['total'] + 1;
    // ## ... Then output things.
}
I believe the specific line of code that's failing me is in the RANK query,
WHERE total > $total
Is it not working because it can't accept a calculated total as an argument in the WHERE clause?
Finally, is there a more efficient way to calculate all of this in a single SQL query?

I think you might want to use 'having total > $total'?
SELECT world, count(*) AS total
FROM highscores
GROUP BY world
having total > $total
ORDER BY total DESC
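Building on the HAVING idea, the rank can be read directly as one plus the number of worlds that have more plays, and the win and loss counts can come from a single pass with conditional aggregation. The sketch below is one possible way to do that, not a tested MySQL implementation. It runs in Python's built-in sqlite3 module so it is self-contained, and the helper world_stats and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE highscores (world TEXT, score INTEGER, victory INTEGER);
    CREATE INDEX idx_highscores_world ON highscores (world);
    -- Invented sample plays
    INSERT INTO highscores VALUES
        ('forest', 10, 1), ('forest', 20, 1), ('forest', 30, 1), ('forest', 5, 0),
        ('desert', 50, 1), ('desert', 0, 0),
        ('cave', 7, 1);
""")


def world_stats(conn, world_name):
    """Wins, losses, average winning score and play-count rank for one world."""
    # One pass over the world's rows instead of separate win and loss queries.
    wins, losses, avg_win_score = conn.execute(
        """
        SELECT SUM(CASE WHEN victory = 1 THEN 1 ELSE 0 END),
               SUM(CASE WHEN victory = 0 THEN 1 ELSE 0 END),
               AVG(CASE WHEN victory = 1 THEN score END)
        FROM highscores
        WHERE world = ?
        """,
        (world_name,),
    ).fetchone()
    # Rank = 1 + number of worlds with strictly more plays than this one.
    (rank,) = conn.execute(
        """
        SELECT COUNT(*) + 1
        FROM (SELECT world
              FROM highscores
              GROUP BY world
              HAVING COUNT(*) > (SELECT COUNT(*) FROM highscores WHERE world = ?)
             ) AS busier_worlds
        """,
        (world_name,),
    ).fetchone()
    return {"wins": wins, "losses": losses,
            "avg_win_score": avg_win_score, "rank": rank}


print(world_stats(conn, "forest"))
```

With this definition, worlds that have the same number of plays share the same rank.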
