Search for the existance of many objects in a database - mysql

Say I have a database with 5 million users, with the columns
id (unsigned int, auto-increment), facebook_id (unsigned int), and name (varchar)
In a program, I have a list of a variable amount of users from a person's facebook friend list (generally ranging from 500-1200 different facebook ids).
What's the most efficient way to send a query to my database that returns the facebook_id's of all of the users where that same facebook_id exists in the database?
Pseudo-code:
$friends = array(12345, 22345, 32345, 42345, 52345, ... ~1000 more);
$q = mysql_query("SELECT * FROM users ...");
$friendsAlreadyUsingApp = parseQuery($q);

This is a topic of almost an endless number of articles, blogs, Q&As etc; and the essence of this problem is that it looks really simple - but isn't.
The heart of the problem is that the parameters looks like it should work using WHERE field IN() BUT it does not do that because the parameter is a single string that just happens to have lots of commas in it.
So, when that parameter is passed to SQL it is necessary to process that single string into multiple parts so that the field can be compared to each part. This is where it gets a little complex as not all database types have all the same features to handle this. MySQL for example does not have a table variable that MS SQL Server provides.
So. A simple method, for MySQL is this:
SET #param := '105,110,125,135,145,155,165,175,185,195,205';
SELECT
*
FROM Users
WHERE FIND_IN_SET(facebook_id, #param) > 0
;
FIND_IN_SET Return the index position of the first argument
within the second argument
Just how well this scales in your database I cannot tell, it might not be acceptable for parameters containing 1000+ id's.
So if text processing like FIND_IN_SET is too slow, then each id needs to be broken out from the parameter and inserted into a table. That way the resulting table can be used through an INNER JOIN to filter the users; but this requires a table and inserts which take time, and there may be concurrency issues if more than one user is attempting to use that table at the same time.
Using the following sets-up a table of 10,000 integers (1 to 10,000)
/* Create a table called Numbers */
CREATE TABLE `Numbers`
(
`Number` int PRIMARY KEY
);
/* use cross joins to create 10,000 integers from 1 & store into table */
INSERT INTO Numbers (Number)
select 1 + (a.a + (10 * b.a) + (100 * c.a) + (1000 * d.a)) as N
from (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as a
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as b
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as c
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as d
;
This "utility table" can then be used to divide a comma separated parameter into a derived table of the individual integers, and this then used in an INNER JOIN to your users table will provide the wanted result.
SET #param := '105,110,125,135,145,155,165,175,185,195,205';
SET #delimit := ',';
SELECT
users.id
, users.facebook_id
, users.name
FROM users
INNER JOIN (
SELECT
CAST(SUBSTRING(iq.param, n.number + 1, LOCATE(#delimit, iq.param, n.number + 1) - n.number - 1) AS UNSIGNED INTEGER) AS itemID
FROM (
SELECT
concat(#delimit, #param, #delimit) AS param
) AS iq
INNER JOIN Numbers n
ON n.Number < LENGTH(iq.param)
WHERE SUBSTRING(iq.param, n.number, 1) = #delimit
) AS derived
ON users.facebook_id = derived.itemID
;
This query can be used as the basis for a stored procedure which might be easier for you to call from PHP.
See this SQLFiddle demo

Related

Creating dynamic mysql queries

Is it possible to create sql file which can get two number parameters and use them in a loop, that in each iteration we do replace into directive using the two parameters, and increment them at the end of the loop?
Can someone show me how to do so?
Edit: Consider I want to update table named zip code, I want to insert new codes in this way:
You get two parameters which are numbers.
The first is the a start code for example: 1000
The second is number of sequential codes to add , lets say 5.
So you will update the table with 1000, 1001... 1004
SQL query cannot do loops, but you can "emulate" them by generating some data and then describing what you want to do with them in declarative way:
-- your input variables
set #start = 1000;
set #count = 5;
select val as zip from (
-- generate some numbers starting with the value of #start
select #start + (a.a + (10 * b.a) + (100 * c.a)) as val
from (
-- this creates cross join of 3 tables of numbers 0-9
-- so the select up there gets rows with values 0-999
-- you can add another cross join and 1000*d.a to get 0-9999
select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as a
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as b
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as c
) tmp
-- we should generate enough numbers to always cover the needs
-- then this condition will filter only currently needed values
where (val >= #start) and (val < #start + #count)
See it at http://sqlfiddle.com/#!9/9eecb7d/17037
I used this in few cases to generate all days between some dates (mostly to fill gaps in data for reporting) and it was surprisingly fast even if I tried big numbers. But if you know what the maximal value of #count can be, you can just use that much.

mysql find numbers in query that are NOT in table

Is there a simple way to compare a list of numbers in my query to a column in a table to return the ones that are NOT in the db?
I have a comma separated list of numbers (1,57, 888, 99, 76, 490, etc etc) that I need to compare to the number column in a table in my DB. SOME of those numbers are in the table, some are not. I need the query to return those that are in my comma separated list, but are NOT in the DB...
I would put the list of numbers to be checked in a table of their own, then use WHERE NOT EXISTS to check whether they exist in the table to be queried. See this SQLFiddle demo for an example of how this might be accomplished:
If you're comfortable with this syntax, you can even avoid putting into a temp table:
SELECT * FROM (
SELECT 1 AS mycolumn
UNION
SELECT 2
UNION
SELECT 3
UNION
SELECT 4
UNION
SELECT 5
UNION
SELECT 6
UNION
SELECT 7
) a
WHERE NOT EXISTS ( SELECT 1 FROM mytable b
WHERE b.mycolumn = a.mycolumn )
UPDATE per comments from OP
If you can insert your very long list of numbers into a table, then query as follows to get the numbers that are not found in the other table:
SELECT mynumber
FROM mytableof37000numbers a
WHERE NOT EXISTS ( SELECT 1 FROM myothertable b
WHERE b.othernumber = a.mynumber)
Alternately
SELECT mynumber
FROM mytableof37000numbers a
WHERE a.mynumber NOT IN ( SELECT b.othernumber FROM myothertable b )
Hope this helps.
May be this is what you are looking for.
Convert your CSV to rows using SUBSTRING_INDEX. Use NOT IN operator to find the values which is not present in DB
Then Convert the result back to CSV using Group_Concat.
select group_concat(value) from(
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(t.a, ',', n.n), ',', -1) value
FROM csv t CROSS JOIN
(
SELECT a.N + b.N * 10 + 1 n
FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
ORDER BY n
) n
WHERE n.n <= 1 + (LENGTH(t.a) - LENGTH(REPLACE(t.a, ',', '')))) ou
where value not in (select a from db)
SQLFIDDLE DEMO
CSV TO ROWS referred from this ANSWER
You could use the 'IN' clause of MySQL. Maybe check this out IN clause tutorial

how to split a string to 4 characters sub strings then concat them by hyphen?

I need to write a mysql query that does the following:
I have list of serial numbers in a database table, the query should only update serial number rows where the length of a serial number is more than 15, then each serial number is divided to 4-digits sub-strings separated by a hyphen. example :
'10028641232912356' should become '1002-8641-1232-9123-56'
I started with this, which only inserts a hyphen after first 4 :
SELECT serial_no, CHAR_LENGTH(TRIM(serial_no)) as 'length',
CONCAT_WS('-',SUBSTRING(TRIM(serial_no),1,4),
SUBSTRING(TRIM(serial_no),5,4)) as result
FROM pos where
serial_no is not null and CHAR_LENGTH(TRIM(serial_no))>=15 ;
this is only the select statement, at first I just want to get (as in select) the new format of the serial no then i'll update it, but since I dont know the exact length of each serial number I need to figure out this part:
CONCAT_WS('-',SUBSTRING(TRIM(serial_no),1,4)
This is must only be done using mysql functions
Any help is appreciated
Introduction
Since we've figured out that your strings can have any length; your issue boils down to how to split strings with any length in to it's chunks of some length in MySQL. Unfortunately there is a serious problem in MySQL that prevent us from doing this in "native" way. And this problem is - MySQL does not support sequences. That means you can not use some sort of internal iterator like construct to loop over your string.
But there is always a way
Building sequence
We can use a trick to do this. First part of trick: use CROSS JOIN to produce the desired row sets. If you are not aware how it works then I'll remind you. It will produce a Cartesian product of the two row sets. The second part of the trick: use a well known formula.
N = d1x101 + d2x102 + ...
Actually you can do that with any base, not just 10. For this demonstration I will use 10. Usage:
SELECT
n1.i+10*n2.i AS num
FROM
(SELECT 0 as i UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) as n1
CROSS JOIN
(SELECT 0 as i UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) as n2
So we're using UNION ALL for the numbers 0..9 to produce a multiplication part (our di). With the query above you'll get consecutive 0..99 numbers. This is because we're using base powers till 2 only, and 102=100. You can check fiddle to make sure that it will work properly.
Iterating through string
Now with these "pseudo-generator" we can emulate iteration through the string. To do that, there are MySQL variables that will be our iterator. Of course separation of string piece is a work for SUBSTR(). So basic skel is:
SELECT
SUBSTR(#str, #i:=#i+#len, #len) as chunk
FROM
(SELECT 0 as i UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) as n1
CROSS JOIN
(SELECT 0 as i UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) as n2
CROSS JOIN
(SELECT #str:='ABCD1234EFGH5678IJKL90', #len:=4, #i:=1-#len) as init
LIMIT 6;
(fiddle for sample above is here). We are just iterating through sequence and then using the iterator to create the correct offset. All that is left to do now is gather our string and insert the hyphens. Hopefully there is GROUP_CONCAT() in MySQL for that:
SELECT
GROUP_CONCAT(chunk SEPARATOR '-') AS string
FROM
(SELECT
SUBSTR(#str, #i:=#i+#len, #len) as chunk
FROM
(SELECT 0 as i UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) as n1
CROSS JOIN
(SELECT 0 as i UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) as n2
CROSS JOIN
(SELECT #str:='ABCD1234EFGH5678IJKL90', #len:=4, #i:=1-#len) as init
) AS data
WHERE chunk!='';
(again, fiddle is here).
With whole table
Now it is a sample and you want to select that from a table. It will be more complicated:
SELECT
serial_no,
GROUP_CONCAT(chunk SEPARATOR '-') AS serial
FROM
(SELECT
SUBSTR(
IF(#str=serial_no, #str, serial_no),
#i:=IF(#str=serial_no, #i+#len, 1),
#len
) AS chunk,
#str:=serial_no AS serial_no
FROM
(SELECT
serial_no
FROM
pos
CROSS JOIN
(SELECT 0 as i UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) as n1
CROSS JOIN
(SELECT 0 as i UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) as n2
ORDER BY
serial_no) AS data
CROSS JOIN
(SELECT #len:=4, #i:=1-#len) AS init) AS seq
WHERE
chunk!=''
GROUP BY
serial_no;
Fiddle is available here. There is another trick to produce the row sets. We'll use CROSS JOIN now instead of initializing string. So we should pass the current pos value of the serial_no. In order to make sure all the rows will be iterated properly we have to order them (that is done with inner ORDER BY).
Limitations
Well as you already know, sequence is limited with 99; thus, with our #len defined as 4 we'll be able to split only by a 4000 length string maximum. Also this will use the whole sequence in any case. Even if your string is much shorter (chances are it is). Thus performance impact may have place.
Is it worth the effort?
My point is: mostly, no. It may be ok to use it once, maybe. But it won't be re-usable and it won't be readable. Thus there is little sense in doing such string operations with DBMS functions because we have applications for such things. It should be used for that and using it you actually can create re-usable/scalable/or/whatever code.
Another way may be to create stored procedure in which we do the desired thing (so, split the string by given length & concatenate it with given delimiter). But honestly, it's just an attempt to "hide the problem". Because even if it would be re-usable, it still will have the same weakness; performance impact. Even more so if we're going to create code for the DBMS. Then again, why don't we place that code in the application? In 99% of cases DBMS is the place for data storage and the application is the place for code (e.g. logic). Mixing these two things almost always ends in a bad result.
Making an assumption that the serial number is unique then the following should do it:-
SELECT serial_no, GROUP_CONCAT(SUBSTRING(serial_no, anInt, 4) ORDER BY anInt SEPARATOR '-')
FROM
(
SELECT serial_no, anInt
FROM pos
CROSS JOIN
(
SELECT (4 * (units.i + 10 * tens.i + 100 * hundreds.i)) + 1 AS anInt
FROM (SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) units
CROSS JOIN (SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) tens
CROSS JOIN (SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) hundreds
) sub1
WHERE LENGTH(serial_no) >= anInt
AND LENGTH(serial_no) > 15
) sub2
GROUP BY serial_no
This will cope with a serial number up to ~4000 characters long.
This works by using a series of unioned fixed queries to get the numbers 0 to 9. This is cross joined against itself 3 times with the first being units, the next being tens and the last being hundreds (you can carry on adding more if you want). From these the numbers between 0 and 999 are generated, then multiplied by 4 and with 1 added (so giving 1 to 3997 in steps of 4), which is the starting position of each group of 4. The WHERE clause checks that this generated number is less than the length of serial_no (if it is you land up with duplicates), and that serial_no is longer than 15.
This will generate a list of all the numbers, each one repeated as many times as their are groups of 4 numbers (or partial groups), along with the start position of a group.
The outer SELECT then takes this list and uses substring to extract each group, and uses GROUP_CONCAT to join he results together again with '-' as the separator between each group. If also specifies the start position of each group as the order to join them again (would probably be fine without this, but I wouldn't guarantee it).
SQL fiddle here:-
http://www.sqlfiddle.com/#!2/eb2d0/2

Sql syntax to insert every thousandth positive integer, starting from 1 and up until 1 million?

I have the following table:
CREATE TABLE dummy (
thousand INT(10) UNSIGNED,
UNIQUE(thousand)
);
Is there sql syntax I can use to insert every thousandth positive integer, starting from 1 and up until 1 million? I can achieve this in php, but I was wondering if this was possible without using a stored procedure.
thousand
1
1001
2001
3001
4001
...
998001
999001
insert into dummy
( thousand )
select
PreQuery.thousand
from
( select
#sqlvar thousand,
#sqlvar := #sqlvar + 1000
from
AnyTableWithAtLeast1000Records,
( select #sqlvar := 1 ) sqlvars
limit 1000 ) PreQuery
You can insert from a select statement. Using MySQL Variables, start with 1. Then, join to ANY table in your system that may have 1000 (or more) records just to generate a row. Even though not getting any actual column from such table, we just need it for the record position. Then the #sqlvar starts at 1 and is returned in column named thousand. Then, immediately add 1000 to it for the next record in the "AnyTable..."
Unfortunately, mysql doesn't support this with any special SQL function.
You have to populate a table, which isn't such a big deal - it would only be 1000 rows.
You can also hack together a temporary table with unions, but that's hardly elegant - might as well use a table.
Other databases do support it, eg with postgres' generate_series() function, but that is little consolation.
As a side note, I often find it handy to have a table populated with consecutive numbers from 1 up to a large numebr for just such as occasion, and I would just select 1000 * num from numbers where num <= 1000.
For a one statement query, i.e. without introducing any additional tables and supporting statement, I'd use this approach in 'pseudo' SQL:
SELECT (D1.Digit + D2.Digit + D3.Digit)* 1000 + 1
FROM
(
SELECT 0 AS Digit UNION ALL
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5 UNION ALL
SELECT 6 UNION ALL
SELECT 7 UNION ALL
SELECT 8 UNION ALL
SELECT 9
) AS D1
CROSS JOIN
(
SELECT 0 AS Digit UNION ALL
SELECT 10 UNION ALL
SELECT 20 UNION ALL
SELECT 30 UNION ALL
SELECT 40 UNION ALL
SELECT 50 UNION ALL
SELECT 60 UNION ALL
SELECT 70 UNION ALL
SELECT 80 UNION ALL
SELECT 90
) AS D2
CROSS JOIN
(
SELECT 0 AS Digit UNION ALL
SELECT 100 UNION ALL
SELECT 200 UNION ALL
SELECT 300 UNION ALL
SELECT 400 UNION ALL
SELECT 500 UNION ALL
SELECT 600 UNION ALL
SELECT 700 UNION ALL
SELECT 800 UNION ALL
SELECT 900
) AS D3
WHERE ((D1.Digit + D2.Digit + D3.Digit)* 1000 + 1) < 1000000
I'm not 100% sure but it should run fine in mysql or perhaps require some minor change.
If you're able to reuse parts of the query, it becomes much prettier, for example in SQL Server, I'd write it as follows:
WITH Digits AS
(
SELECT 0 AS Digit UNION ALL
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5 UNION ALL
SELECT 6 UNION ALL
SELECT 7 UNION ALL
SELECT 8 UNION ALL
SELECT 9
)
SELECT (D1.Digit + D2.Digit + D3.Digit)* 1000 + 1
FROM (SELECT Digit FROM Digits) AS D1
CROSS JOIN (SELECT Digit * 10 AS Digit FROM Digits) AS D2
CROSS JOIN (SELECT Digit * 100 AS Digit FROM Digits) AS D3
WHERE ((D1.Digit + D2.Digit + D3.Digit)* 1000 + 1) < 1000000
Keep an eye on where multiplication happens, it might be more efficient to multiply in sub-queries, rather than in the resulting expression.
Here is a trick, that I usually use for those sort of problems, similar to #sergeBelov solution:
Create an anchor table, a temp table, and fill it with values from 0 to 9, like so:
CREATE TABLE TEMP (Digit int);
INSERT INTO Temp VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
Then you can do this:
INSERT INTO dummy(thousand)
SELECT 1 + (id - 1) * 1000 AS n
FROM
(
SELECT t3.digit * 100 + t2.digit * 10 + t1.digit + 1 AS id
FROM TEMP AS t1
CROSS JOIN TEMP AS t2
CROSS JOIN TEMP AS t3
) t;
SQL Fiddle Demo
How does this work?
The sequence numbers(1, 1001, 2001, ... , 998001, 999001) 1000 terms, that you are looking for, is what they called Arithmetic progression, and in your case the nth term of the sequence (an) is given by:
A + (n - 1) * d
In you sequence: a = 1, d = 1000
Where A is the first term of the sequence, n is the term and d is the difference between each two terms(it is the same for each two successive terms).
The subquery:
SELECT t3.digit * 100 + t2.digit * 10 + t1.digit + 1 AS id
FROM TEMP AS t1
CROSS JOIN TEMP AS t2
CROSS JOIN TEMP AS t3;
Will generate a list of numbers from 1 to 1000(the total number of terms in your sequence), after that we got each term in the sequence from these number by 1 + (id - 1) * 1000 in the outer select.

Changing a Query with a numbered result set (with gaps,) to return result with no gaps, containing every number.

I have a select statement: select a, b, [...]; which returns the results:
a|b
---------
1|8688798
2|355744
4|457437
7|27834
I want it to return:
a|b
---------
1|8688798
2|355744
3|0
4|457437
5|0
6|0
7|27834
An example query that does not do what I would like, since it does not have the gap numbers:
select
sub.num_of_ratings,
count(sub.rater)
from
(
select
r.rater_id as rater,
count(r.id) as num_of_ratings
from ratings r
group by rater
) as sub
group by num_of_ratings;
Explanation of the query:
If a user rates another user, the rating is listed in the table ratings and the id of the rating user is kept in the field rater_id. Effectively I check for all users who are referred to in ratings and count how many ratings records I find for that user, which is rater / num_of_ratings, and then I use this result to find how many users have rated a given number of times.
At the end I know how many users rated once, how many users rated twice, etc. My problem is that the numbers for count(sub.rater) start fine from 1,2,3,4,5... However, for bigger numbers there are gaps. This is because there might be one user who rated 1028 times - but no user who rated 1027 times.
I don't want to apply stored procedures looping over the result or something like that. Is it possible to fill those gaps in the result without using stored procedures, looping, or creating temporary tables?
If you have a sequence of numbers, then you can do a JOIN with that table and fill in the gaps properly.
You can check out this questions on how to get the sequence:
generate an integer sequence in MySQL
Here is one of the answers posted that might be easily used with the limitation that generates numbers from 1 to 10,000:
SELECT #row := #row + 1 as row FROM
(select 0 union all select 1 union all select 3 union all select 4 union all select 5 union all select 6 union all select 6 union all select 7 union all select 8 union all select 9) t,
(select 0 union all select 1 union all select 3 union all select 4 union all select 5 union all select 6 union all select 6 union all select 7 union all select 8 union all select 9) t2,
(select 0 union all select 1 union all select 3 union all select 4 union all select 5 union all select 6 union all select 6 union all select 7 union all select 8 union all select 9) t3,
(select 0 union all select 1 union all select 3 union all select 4 union all select 5 union all select 6 union all select 6 union all select 7 union all select 8 union all select 9) t4,
(SELECT #row:=0) t5
Using a sequence of numbers, you can join your result set. For instance, assuming your number list is in a table called numbersList, with column number:
Select number, Count
from
numbersList left outer join
(select
sub.num_of_ratings,
count(sub.rater) as Count
from
(
select
r.rater_id as rater,
count(r.id) as num_of_ratings
from ratings r
group by rater
) as sub
group by num_of_ratings) as num
on num.num_of_ratings=numbersList.number
where numbersList.number<max(num.num_of_ratings)
Your numbers list must be larger than your largest value, obviously, and the restriction will allow it to not have all numbers up to the maximum. (If MySQL does not allow that type of where clause, you can either leave the where clause out to list all numbers up to the maximum, or modify the query in various ways to achieve the same result.)
#mazzucci: the query is too magical and you are not actually explaining the query.
#David: I cannot create a table for that purpose (as stated in the question)
Basically what I need is a select that returns a gap-less list of numbers. Then I can left join on that result set and treat NULL as 0.
What I need is an arbitrary table that keeps more records than the length of the final list. I use the table user for that in the following example:
select #row := #row + 1 as index
from (select #row := -1) r, users u
limit 101;
This query returns a set of the numbers von 0 to 100. Using it as a subquery in a left join finally fills the gap.
users is just a dummy to keep the relational engine going and hence producing the numbers incrementally.
select t1.index as a, ifnull(t2.b, 0) as b
from (
select #row := #row + 1 as index
from (select #row := 0) r, users u
limit 7
) as t1
left join (
select a, b [...]
) as t2
on t1.index = t2.a;
I didn't try this very query live, so have merci with me if there is a little flaw. but technically it works. you get my point.
EDIT:
just used this concept to gain a gapless list of dates to left join measures onto it:
select #date := date_add(#date, interval 1 day) as date
from (select #date := '2010-10-14') d, users u
limit 700
starts from 2010/10/15 and iterates 699 more days.