Need to get all referral IDs using MySQL - mysql

I have a referral table like the one below.
> id referredByID referrerID
>
> 1001 1 2
>
> 1002 2 3
>
> 1003 2 4
>
> 1004 5 7
From the above table structure I need to get the users whom I referred and the users who were referred by my referrals.
For Example:
I am referredByID 1.
I referred ID 2.
ID 2 then referred ID 3.
ID 2 also referred ID 4.
My output needs to look like this:
Referrals Done By Me:
id - 2
id - 3
id - 4
How can this be done using MySQL?
Any help will be appreciated. Thanks in advance.

I think I got everything the right way round but your naming conventions confused me so you'd better check everything.
If I call the following stored procedure:
call referrals_hier(1);
I get the following results:
+--------------+------------+-------+
| referredByID | referrerID | depth |
+--------------+------------+-------+
| 1 | 2 | 0 |
| 2 | 3 | 1 |
| 2 | 4 | 1 |
+--------------+------------+-------+
3 rows in set (0.00 sec)
full script here: http://pastie.org/1466596
Stored procedure
drop table if exists referrals;
create table referrals
(
id smallint unsigned not null primary key,
referrerID smallint unsigned not null,
referredByID smallint unsigned null,
key (referredByID)
)
engine = innodb;
insert into referrals (id, referredByID, referrerID) values
(1001,1,2),(1002,2,3),(1003,2,4),(1004,5,7);
drop procedure if exists referrals_hier;
delimiter #
create procedure referrals_hier
(
    in p_refID smallint unsigned
)
begin
    declare v_done tinyint unsigned default 0;
    declare v_dpth smallint unsigned default 0;
    create temporary table hier(
        referredByID smallint unsigned,
        referrerID smallint unsigned,
        depth smallint unsigned
    )engine = memory;
    -- seed the result with the direct referrals (depth 0)
    insert into hier select referredByID, referrerID, v_dpth from referrals where referredByID = p_refID;
    /* http://dev.mysql.com/doc/refman/5.0/en/temporary-table-problems.html */
    -- a temporary table cannot be referenced twice in one query, hence the copy
    create temporary table tmp engine=memory select * from hier;
    while not v_done do
        if exists( select 1 from referrals e inner join hier on e.referredByID = hier.referrerID and hier.depth = v_dpth) then
            insert into hier select e.referredByID, e.referrerID, v_dpth + 1
                from referrals e inner join tmp on e.referredByID = tmp.referrerID and tmp.depth = v_dpth;
            set v_dpth = v_dpth + 1;
            truncate table tmp;
            insert into tmp select * from hier where depth = v_dpth;
        else
            set v_done = 1;
        end if;
    end while;
    select * from hier order by depth;
    drop temporary table if exists hier;
    drop temporary table if exists tmp;
end #
delimiter ;
-- call this sproc from your php
call referrals_hier(1);
Hope this helps :)
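If you would rather do the traversal in application code, the level-by-level loop of the procedure above can be sketched roughly like this. It is only an illustration in Python: the function name `referral_tree` is made up, and it assumes the whole referrals table has already been fetched as (referredByID, referrerID) pairs.

```python
from collections import defaultdict

def referral_tree(rows, root_id):
    """Return (referredByID, referrerID, depth) for every direct and
    indirect referral of root_id, one level at a time."""
    # Map each referring user to the users they referred.
    children = defaultdict(list)
    for referred_by, referred in rows:
        children[referred_by].append(referred)

    result = []
    seen = {root_id}          # guards against cycles in the data
    frontier = [root_id]      # users whose referrals are expanded next
    depth = 0
    while frontier:
        next_frontier = []
        for parent in frontier:
            for child in children[parent]:
                if child in seen:
                    continue
                seen.add(child)
                result.append((parent, child, depth))
                next_frontier.append(child)
        frontier = next_frontier
        depth += 1
    return result

# Sample rows from the question: (referredByID, referrerID)
rows = [(1, 2), (2, 3), (2, 4), (5, 7)]
for row in referral_tree(rows, 1):
    print(row)
```

For user 1 this yields the same three rows as the procedure call, with the same depth values.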

There are two ways, both described here with examples:
http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/

These are the cases where the lack of recursive common table expressions in MySQL (they only arrived in version 8.0) really hurts.
If you have an upper limit on the levels, then you might be able to do this with several self joins:
SELECT l1.referrerID, l2.referrerID, ...
FROM your_table l1
LEFT JOIN your_table l2 ON l2.referredByID = l1.referrerID
LEFT JOIN your_table l3 ON l3.referredByID = l2.referrerID
LEFT JOIN your_table l4 ON l4.referredByID = l3.referrerID
... (you get the picture)
As you can see, this gets ugly with more levels and will not perform very well on larger sets.
If you cannot change your table design, then I would suggest making a good guess at the maximum depth you can have and creating a view that retrieves all levels. At least that makes things easier in the application or for ad-hoc queries.
On top of that (huge self join) view, you can also build another view that returns each level as its own row. But that will be even slower.
But as long as you deal with MySQL, the best thing to do is to change the table design to use the nested set model, which is described in the link that Anonymous87 has posted.
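To give an idea of what the nested set model buys you, here is a rough Python sketch that numbers the referral tree with left/right bounds and then finds all descendants with a plain range comparison. The names `nested_set_bounds` and `descendants` are made up for illustration, and it assumes the referrals form a tree without cycles.

```python
from collections import defaultdict

def nested_set_bounds(edges, root):
    """Assign nested-set (lft, rgt) numbers to every node reachable from root."""
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    bounds = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        lft = counter                      # number taken on the way down
        for child in children.get(node, []):
            visit(child)
        counter += 1
        bounds[node] = (lft, counter)      # number taken on the way back up

    visit(root)
    return bounds

def descendants(bounds, node):
    """All nodes whose interval lies strictly inside the interval of node."""
    lft, rgt = bounds[node]
    return sorted(n for n, (l, r) in bounds.items() if lft < l and r < rgt)

# (referredByID, referrerID) pairs from the question
edges = [(1, 2), (2, 3), (2, 4), (5, 7)]
bounds = nested_set_bounds(edges, 1)
print(bounds)
print(descendants(bounds, 1))
```

With lft and rgt stored as columns, the same lookup becomes a single query with a condition like `lft > ? AND rgt < ?`, which is why no recursion is needed at read time. The price is that inserts have to renumber part of the tree.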

Related

Does MySQL InnoDB create consistent snapshots for SELECT on multiple tables with UNION when isolation level is READ COMMITTED

Consider two tables like this:
TABLE: current
-------------------
| id | dept | value |
|----|------|-------|
| 4| A | 20 |
| 5| B | 15 |
| 6| A | 25 |
-------------------
TABLE: history
-------------------
| id | dept | value |
|----|------|-------|
| 1| A | 10 |
| 2| C | 10 |
| 3| B | 20 |
-------------------
These are just simple examples... in the actual system both tables have considerably more columns and considerably more rows (10k+ rows in current and 1M+ rows in history).
A client application is continuously (several times a second) inserting new rows into the current table, and 'moving' older existing rows from current to history (delete/insert within a single transaction).
Without blocking the client in this activity we need to take a consistent sum of values per dept across the two tables.
With transaction isolation level set to REPEATABLE READ we could just do:
SELECT dept, sum(value) FROM current GROUP BY dept;
followed by
SELECT dept, sum(value) FROM history GROUP BY dept;
and add the two sets of results together. BUT each query would block inserts on its respective table.
Changing the isolation level to READ COMMITTED and running the same two queries would avoid blocking inserts, but now there is a risk of entries being double counted if they are moved from current to history while we are querying (since each SELECT creates its own snapshot).
Here's the question then.... what happens with isolation level READ COMMITTED if I do a UNION:
SELECT dept, sum(value) FROM current GROUP BY dept
UNION ALL
SELECT dept, sum(value) FROM history GROUP BY dept;
Will MySQL generate a consistent snapshot of both tables at the same time (thereby removing the risk of double counting), or will it still take a snapshot of one table first and then, some time later, a snapshot of the second?
I have not yet found any conclusive documentation to answer my question, so I went about trying to prove it instead. Although not proof in the scientific sense, my findings suggest a consistent snapshot is created for all tables in a UNION query.
Here's what I did.
Create the tables
DROP TABLE IF EXISTS `current`;
CREATE TABLE IF NOT EXISTS `current` (
`id` BIGINT NOT NULL COMMENT 'Unique numerical ID.',
`dept` BIGINT NOT NULL COMMENT 'Department',
`value` BIGINT NOT NULL COMMENT 'Value',
PRIMARY KEY (`id`));
DROP TABLE IF EXISTS `history`;
CREATE TABLE IF NOT EXISTS `history` (
`id` BIGINT NOT NULL COMMENT 'Unique numerical ID.',
`dept` BIGINT NOT NULL COMMENT 'Department',
`value` BIGINT NOT NULL COMMENT 'Value',
PRIMARY KEY (`id`));
Create a procedure that sets up 10 entries in the current table (id = 0, .. 9), then sits in a tight loop inserting 1 new row into current and 'moving' the oldest row from current to history. Each iteration is performed in a transaction, so the current table remains at a steady 10 rows while the history table grows quickly. At any point in time, min(current.id) = max(history.id) + 1.
DROP PROCEDURE IF EXISTS `idLoop`;
DELIMITER $$
CREATE PROCEDURE `idLoop`()
BEGIN
    DECLARE n bigint;
    -- Populate initial 10 rows in current table if not already there
    SELECT IFNULL(MAX(id), -1) + 1 INTO n from current;
    START TRANSACTION;
    WHILE n < 10 DO
        INSERT INTO current VALUES (n, n % 10, n % 1000);
        SET n = n + 1;
    END WHILE;
    COMMIT;
    -- In tight loop, insert new row and 'move' oldest current row to history
    WHILE n < 10000000 DO
        START TRANSACTION;
        -- Insert new row to current
        INSERT INTO current values(n, n % 10, n % 1000);
        -- Move oldest row from current to history
        INSERT INTO history SELECT * FROM current WHERE id = (n - 10);
        DELETE FROM current where id = (n - 10);
        COMMIT;
        SET n = n + 1;
    END WHILE;
END$$
DELIMITER ;
Start this procedure running (this call won't return for some time, which is intentional):
call idLoop();
In another session on the same database we can now try out a variation on the UNION ALL query in my original posting.
I have modified it to (a) slow down execution, and (b) return a simple result set (two rows) that indicates whether any entries 'moved' whilst the query was running have been missed or double counted.
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
SELECT 'HST' AS src, MAX(id) AS idx, COUNT(*) AS cnt, SUM(value) FROM history WHERE dept IN (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
UNION ALL
SELECT 'CRT' AS src, MIN(id) AS idx, COUNT(*) AS cnt, SUM(value) FROM current WHERE dept IN (0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
The sum(value) and where dept in (...) are just there to add work to the query and slow it down.
The indication of a positive outcome is if the two idx values are adjacent, like this:
+-----+--------+--------+------------+
| src | idx | cnt | SUM(value) |
+-----+--------+--------+------------+
| HST | 625874 | 625875 | 312569875 |
| CRT | 625875 | 10 | 8795 |
+-----+--------+--------+------------+
2 rows in set (1.43 sec)
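When running the query many times, checking the two rows by eye gets tedious. A small helper along these lines could do it; this is a hypothetical Python sketch (the name `check_union_result` is mine) that relies on the test setup described above: ids start at 0 and current always holds 10 rows.

```python
def check_union_result(rows, window=10):
    """rows: the two result rows as (src, idx, cnt, total) tuples."""
    by_src = {src: (idx, cnt) for src, idx, cnt, _total in rows}
    hst_idx, hst_cnt = by_src["HST"]
    crt_idx, crt_cnt = by_src["CRT"]
    return {
        # oldest current id directly follows newest history id:
        # a gap means a missed row, an overlap means a double count
        "adjacent": crt_idx == hst_idx + 1,
        # history ids run 0..max(id), so the count must be max(id) + 1
        "history_complete": hst_cnt == hst_idx + 1,
        # current should always hold exactly `window` rows
        "current_full": crt_cnt == window,
    }

rows = [("HST", 625874, 625875, 312569875), ("CRT", 625875, 10, 8795)]
print(check_union_result(rows))
```

All three checks pass for the result shown above. A run where any of them fails would be evidence against a single consistent snapshot.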
I'd still be happy to hear any authoritative information on this.

Manage revisions in postgresql

I am currently trying to manage revisions of a data set in a PostgreSQL database. The table I would like to use has the following structure:
CREATE TABLE dataset (
id BIGSERIAL PRIMARY KEY,
revision INTEGER NOT NULL,
object_id BIGINT NOT NULL
);
The id field is a unique auto-increment identifier. The object_id should be the identifier for an object, while revision keeps track of the revisions:
id | object_id | revision
-------------------------
1 | 1 | 1
2 | 2 | 1
3 | 1 | 2
4 | 1 | 3
5 | 3 | 1
6 | 4 | 1
What I now need is a function, that:
Sets an auto-increment object_id and sets revision to 1, if no object_id is provided.
Sets an auto-increment revision for this object_id, if an object_id is provided.
I already found this answer, but it does not really solve the problem of creating consecutive revisions for an object_id, and it does not solve the problem of automatically creating consecutive object_ids.
EDIT:
I would do something like the following, but this doesn't feel very comfortable:
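CREATE OR REPLACE FUNCTION update_revision() RETURNS TRIGGER LANGUAGE plpgsql AS
$$
BEGIN
    IF tg_op = 'INSERT' THEN
        IF NEW.object_id IS NULL THEN
            -- No object_id given: take the next one from the sequence, start at revision 1
            NEW.object_id := nextval('object_id_seq_id');
            NEW.revision := 1;
        ELSE
            -- Existing object_id: next consecutive revision for that object
            NEW.revision := (SELECT COALESCE(MAX(revision), 0) + 1 FROM dataset WHERE object_id = NEW.object_id);
        END IF;
    END IF;
    RETURN NEW;
END;
$$;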
CREATE TRIGGER update_revision BEFORE INSERT OR UPDATE ON dataset
FOR EACH ROW EXECUTE PROCEDURE update_revision();
Make (object_id, revision) unique. BTW why aren't they the primary key?
create table dataset (
id bigserial primary key,
object_id bigint not null,
revision integer not null,
unique (object_id, revision)
);
create or replace function include_revision (_object_id integer)
returns dataset as $$
    with object_id as (
        select coalesce(max(object_id), 0) + 1 as object_id
        from dataset
    ), revision as (
        select coalesce(max(revision), 0) + 1 as revision
        from dataset
        where object_id = _object_id
    )
    insert into dataset (object_id, revision)
    select
        coalesce(_object_id, (select object_id from object_id)),
        (select revision from revision)
    returning *
    ;
$$ language sql;
object_id is set to coalesce(_object_id, (select object_id from object_id)); that is, the calculated max(object_id) + 1 is used only if _object_id is null.
Testing:
select include_revision(null);
include_revision
------------------
(1,1,1)
select include_revision(1);
include_revision
------------------
(2,1,2)
select include_revision(null);
include_revision
------------------
(3,2,1)
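To make the numbering rule easier to follow, here is the same logic as a small in-memory Python model. It is purely illustrative: the list stands in for the table, and `len(dataset) + 1` is only a stand-in for the bigserial id.

```python
def include_revision(dataset, object_id=None):
    """Append a new row to dataset (a list of dicts) and return it."""
    if object_id is None:
        # No object given: next object_id is max(object_id) + 1, or 1 if empty
        object_id = max((row["object_id"] for row in dataset), default=0) + 1
    # Next revision for this object: max(revision) + 1, or 1 for a new object
    revision = max(
        (row["revision"] for row in dataset if row["object_id"] == object_id),
        default=0,
    ) + 1
    row = {"id": len(dataset) + 1, "object_id": object_id, "revision": revision}
    dataset.append(row)
    return row

dataset = []
print(include_revision(dataset))      # new object
print(include_revision(dataset, 1))   # new revision of object 1
print(include_revision(dataset))      # another new object
```

The three calls produce the same (id, object_id, revision) triples as the test above. In the database, two concurrent callers can compute the same max() + 1; the unique constraint on (object_id, revision) turns that into an error instead of a silent duplicate.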

Split comma separated values from one column to 2 rows in the results. MySQL

MySQL. Two columns, same table.
Column 1 has product_id
Column 2 has category_ids (sometimes 2 categories, so will look like 23,43)
How do I write a query that returns a list of product_id, category_id pairs, with a separate row for each category_id when more than one is associated with a product_id?
i.e
TABLE:
product_id | category_ids
100 | 200,300
101 | 201
QUERY RESULT: Not trying to modify the table
100 | 200
100 | 300
101 | 201
EDIT: (note) I don't actually wish to manipulate the table at all. I am just running a query from PHP so I can use the data as needed.
Your table design looks poor. What you need is the reverse of GROUP_CONCAT, but unfortunately no such function exists in MySQL.
You have two viable solutions:
Change the way you store the data (allow duplicates in the product_id field and store one record per product_id and category_id pair)
Manipulate the query result from within your application (you mentioned PHP in your question); in this case you have to split the category_ids column values and assemble the result set on your own
There is also a third solution I have found that is more of a trick (using a temporary table and a stored procedure). First of all you have to declare this stored procedure:
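For the second option, the splitting step is only a few lines in any language. A minimal sketch in Python (the same idea applies in PHP with explode()); the function name `explode_categories` is made up, and rows are assumed to arrive as (product_id, category_ids) pairs:

```python
def explode_categories(rows):
    """Turn (product_id, 'a,b,c') rows into one (product_id, category_id) row per id."""
    result = []
    for product_id, category_ids in rows:
        for part in category_ids.split(","):
            part = part.strip()
            if part:  # skip empty pieces such as a trailing comma
                result.append((product_id, int(part)))
    return result

rows = [(100, "200,300"), (101, "201")]
for row in explode_categories(rows):
    print(row)
```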
DELIMITER $$
CREATE PROCEDURE csv_Explode( sSepar VARCHAR(255), saVal TEXT )
body:
BEGIN
    DROP TEMPORARY TABLE IF EXISTS csv_Explode;
    CREATE TEMPORARY TABLE csv_Explode(
        `pos` int unsigned NOT NULL auto_increment,
        `val` VARCHAR(255) NOT NULL,
        PRIMARY KEY (`pos`)
    ) ENGINE=Memory COMMENT='Explode() results.';
    IF sSepar IS NULL OR saVal IS NULL THEN LEAVE body; END IF;
    SET @saTail = saVal;
    SET @iSeparLen = LENGTH( sSepar );
    create_layers:
    WHILE @saTail != '' DO
        # Get the next value
        SET @sHead = SUBSTRING_INDEX(@saTail, sSepar, 1);
        SET @saTail = SUBSTRING( @saTail, LENGTH(@sHead) + 1 + @iSeparLen );
        INSERT INTO csv_Explode SET val = @sHead;
    END WHILE;
END; $$
DELIMITER ;
Then you have to call the procedure, passing the value of the column you want to explode:
CALL csv_Explode(',', (SELECT category_ids FROM products WHERE product_id = 100));
After this you can show the results from the temporary table like this:
SELECT * FROM csv_Explode;
And the result set will be :
+-----+-----+
| pos | val |
+-----+-----+
| 1 | 200 |
| 2 | 300 |
+-----+-----+
It could be a starting point for you ...

generate_series in MySQL

What is the PostgreSQL's generate_series() equivalent in MySQL?
How to convert this query to MySQL?
select substr('some-string', generate_series(1, char_length('some-string')))
Sample output from PostgreSQL:
some-string
ome-string
me-string
e-string
-string
string
tring
ring
ing
ng
g
select generate_series(1, char_length('some-string'))
1
2
3
4
5
6
7
8
9
10
11
Final solution:
CREATE TABLE `numberlist` (
`id` tinyint(4) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`)
);
INSERT INTO `numberlist` values(null);
(repeat the above query as many times as the length of the longest string you need)
SELECT substr('somestring', id)
FROM numberlist
WHERE id <= character_length('somestring');
Here is the concept, but I don't have MySQL installed on this box. You will need to create a table of integers, using AUTO_INCREMENT. A table of numbers is generally a handy table to have available in a database, and needs to be created only once.
create table NumberList (id MEDIUMINT NOT NULL AUTO_INCREMENT PRIMARY KEY, fill char(1))
declare @x INT
set @x=0
while @x < 20
begin
insert into NumberList (fill) values(null)
set @x = @x+1
end
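Then select from this table as shown below, restricting id to the length of the string (MySQL's LIMIT does not accept an expression, so a WHERE clause is used)
select substr('somestring',id)
from NumberList
where id <= char_length('somestring')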
I wrote this in SQL Server, but it shouldn't be too difficult to convert to MySQL...
The code below SHOULD work in MySQL (inside a stored procedure body, since DECLARE and WHILE are only valid there)
DECLARE xx INT DEFAULT 0;
WHILE xx < 20 DO
INSERT INTO NumberList (fill) VALUES (NULL);
SET xx = xx + 1;
END WHILE;
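To check what rows to expect from the numbers-table query, the same selection can be mimicked in a few lines of Python. This is just a sketch for comparison, with `numbers` standing in for the ids 1 to 20 in the NumberList table:

```python
def suffixes(text, numbers):
    """Mimic SELECT substr(text, id) FROM numbers WHERE id <= char_length(text)."""
    # SQL substr() positions are 1-based, Python slices are 0-based
    return [text[n - 1:] for n in numbers if n <= len(text)]

numbers = range(1, 21)  # stand-in for the ids in the numbers table
for row in suffixes("some-string", numbers):
    print(row)
```

This prints the eleven suffixes from the question, from some-string down to g. Ids beyond the string length are filtered out, which is the job of the WHERE clause in the SQL version.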

Generating Depth based tree from Hierarchical Data in MySQL (no CTEs)

Hi. I have been working on this problem in MySQL for many days, but I cannot figure it out. Do any of you have suggestions?
Basically, I have a category table with columns like: id, name (name of category), and parent (id of the category's parent).
Example Data:
1 Fruit 0
2 Apple 1
3 pear 1
4 FujiApple 2
5 AusApple 2
6 SydneyAPPLE 5
....
There are many levels, possibly more than 3. I want to create an SQL query that groups the data according to the hierarchy: parent > child > grandchild > etc.
It should output the tree structure, as follows:
1 Fruit 0
  ^ 2 Apple 1
    ^ 4 FujiApple 2
    - 5 AusApple 2
      ^ 6 SydneyApple 5
  - 3 pear 1
Can I do this using a single SQL query? The alternative, which I tried and does work, is the following:
SELECT * FROM category WHERE parent=0
After this, I loop through the data again and select the rows where parent=id. This seems like a bad solution. Because this is MySQL, CTEs cannot be used.
You can do it in a single call from php to mysql if you use a stored procedure:
Example calls
mysql> call category_hier(1);
+--------+---------------+---------------+----------------------+-------+
| cat_id | category_name | parent_cat_id | parent_category_name | depth |
+--------+---------------+---------------+----------------------+-------+
| 1 | Location | NULL | NULL | 0 |
| 3 | USA | 1 | Location | 1 |
| 4 | Illinois | 3 | USA | 2 |
| 5 | Chicago | 3 | USA | 2 |
+--------+---------------+---------------+----------------------+-------+
4 rows in set (0.00 sec)
$sql = sprintf("call category_hier(%d)", $id);
Hope this helps :)
Full script
Test table structure:
drop table if exists categories;
create table categories
(
cat_id smallint unsigned not null auto_increment primary key,
name varchar(255) not null,
parent_cat_id smallint unsigned null,
key (parent_cat_id)
)
engine = innodb;
Test data:
insert into categories (name, parent_cat_id) values
('Location',null),
('USA',1),
('Illinois',2),
('Chicago',2),
('Color',null),
('Black',3),
('Red',3);
Procedure:
drop procedure if exists category_hier;
delimiter #
create procedure category_hier
(
    in p_cat_id smallint unsigned
)
begin
    declare v_done tinyint unsigned default 0;
    declare v_depth smallint unsigned default 0;
    create temporary table hier(
        parent_cat_id smallint unsigned,
        cat_id smallint unsigned,
        depth smallint unsigned default 0
    )engine = memory;
    insert into hier select parent_cat_id, cat_id, v_depth from categories where cat_id = p_cat_id;
    /* http://dev.mysql.com/doc/refman/5.0/en/temporary-table-problems.html */
    create temporary table tmp engine=memory select * from hier;
    while not v_done do
        if exists( select 1 from categories p inner join hier on p.parent_cat_id = hier.cat_id and hier.depth = v_depth) then
            insert into hier
                select p.parent_cat_id, p.cat_id, v_depth + 1 from categories p
                inner join tmp on p.parent_cat_id = tmp.cat_id and tmp.depth = v_depth;
            set v_depth = v_depth + 1;
            truncate table tmp;
            insert into tmp select * from hier where depth = v_depth;
        else
            set v_done = 1;
        end if;
    end while;
    select
        p.cat_id,
        p.name as category_name,
        b.cat_id as parent_cat_id,
        b.name as parent_category_name,
        hier.depth
    from
        hier
        inner join categories p on hier.cat_id = p.cat_id
        left outer join categories b on hier.parent_cat_id = b.cat_id
    order by
        hier.depth, hier.cat_id;
    drop temporary table if exists hier;
    drop temporary table if exists tmp;
end #
delimiter ;
Test runs:
call category_hier(1);
call category_hier(2);
Some performance testing using Yahoo geoplanet places data
drop table if exists geoplanet_places;
create table geoplanet_places
(
woe_id int unsigned not null,
iso_code varchar(3) not null,
name varchar(255) not null,
lang varchar(8) not null,
place_type varchar(32) not null,
parent_woe_id int unsigned not null,
primary key (woe_id),
key (parent_woe_id)
)
engine=innodb;
mysql> select count(*) from geoplanet_places;
+----------+
| count(*) |
+----------+
| 5653967 |
+----------+
So that's 5.6 million rows (places) in the table. Let's see how the adjacency list implementation and stored procedure, called from PHP, handle that.
1 records fetched with max depth 0 in 0.001921 secs
250 records fetched with max depth 1 in 0.004883 secs
515 records fetched with max depth 1 in 0.006552 secs
822 records fetched with max depth 1 in 0.009568 secs
918 records fetched with max depth 1 in 0.009689 secs
1346 records fetched with max depth 1 in 0.040453 secs
5901 records fetched with max depth 2 in 0.219246 secs
6817 records fetched with max depth 1 in 0.152841 secs
8621 records fetched with max depth 3 in 0.096665 secs
18098 records fetched with max depth 3 in 0.580223 secs
238007 records fetched with max depth 4 in 2.003213 secs
Overall I'm pretty pleased with those cold runtimes, as I wouldn't even begin to consider returning tens of thousands of rows of data to my front end; I would rather build the tree dynamically, fetching only several levels per call. Oh, and just in case you were thinking InnoDB is slower than MyISAM: the MyISAM implementation I tested was twice as slow on all counts.
More stuff here : http://pastie.org/1672733
Hope this helps :)
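One more note: the procedure returns rows ordered by depth, while the question asks for parent > child > grandchild order. Once all rows are in the application, that ordering is a short depth-first walk. A rough Python sketch using the sample data from the question (the function name `tree_order` is made up, and parent 0 is assumed to mark a root):

```python
from collections import defaultdict

def tree_order(rows, root_parent=0):
    """Return (depth, id, name, parent) tuples in parent > child > grandchild order."""
    # Group the categories under their parent id.
    children = defaultdict(list)
    for cat_id, name, parent in rows:
        children[parent].append((cat_id, name))

    ordered = []

    def walk(parent, depth):
        # Visit each child, then immediately descend into its own children.
        for cat_id, name in sorted(children.get(parent, [])):
            ordered.append((depth, cat_id, name, parent))
            walk(cat_id, depth + 1)

    walk(root_parent, 0)
    return ordered

rows = [
    (1, "Fruit", 0), (2, "Apple", 1), (3, "pear", 1),
    (4, "FujiApple", 2), (5, "AusApple", 2), (6, "SydneyAPPLE", 5),
]
for depth, cat_id, name, parent in tree_order(rows):
    print("  " * depth + f"{cat_id} {name} {parent}")
```

The ids come out in the order 1, 2, 4, 5, 6, 3, which matches the tree in the question. This needs one query for the rows and no further round trips.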
There are two common ways of storing hierarchical data in an RDBMS: adjacency lists (which you are using) and nested sets. There is a very good write-up about these alternatives in Managing Hierarchical Data in MySQL. You can only do what you want in a single query with the nested set model. However, the nested set model makes it more work to update the hierarchical structure, so you need to consider the trade-offs depending on your operational requirements.
You can't achieve this using a single query. Your hierarchical data model is ineffective in this case. I suggest you try two other ways of storing hierarchical data in a database: the MPTT model or the "lineage" model. Using either of those models allows you to do the select you want in a single go.
Here is an article with further details: http://articles.sitepoint.com/article/hierarchical-data-database
The linear way:
I am using an ugly function to build the tree in a simple string field.
/ topic title
/001 message 1
/002 message 2
/002/001 reply to message 2
/002/001/001 reply to reply
/003 message 3
etc...
The table can then be used to select all the rows in tree order with a simple SQL query:
select * from forum_messages where m_topic=1234 order by m_linear asc
INSERT just selects the parent's linear value (and those of its children) and calculates the new string as needed.
select M_LINEAR FROM forum_messages WHERE m_topic = 1234 and M_LINEAR LIKE '{0}/___' ORDER BY M_LINEAR DESC limit 0,1
/* {0} - m_linear of the parent message*/
DELETE is as simple as deleting the message, or deleting all replies of a parent by their linear prefix.
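The "calculate the string" step for INSERT might look roughly like this. It is a hypothetical Python sketch (the name `next_linear` is mine), assuming three-digit segments as in the example and that the query above returned the highest existing child, or nothing when there are no replies yet:

```python
def next_linear(parent_linear, last_child_linear=None, width=3):
    """Build the m_linear value for a new reply under parent_linear."""
    prefix = parent_linear.rstrip("/")
    if last_child_linear is None:
        number = 1  # first reply under this parent
    else:
        # Take the last path segment of the highest existing child and add one
        number = int(last_child_linear.rstrip("/").rsplit("/", 1)[1]) + 1
    return f"{prefix}/{number:0{width}d}"

print(next_linear("/002"))               # first reply to message 2
print(next_linear("/002", "/002/001"))   # second reply to message 2
print(next_linear("/", "/003"))          # next top-level message
```

Zero-padding matters because the tree order comes from a plain string sort. With three digits, a parent can hold at most 999 direct replies before the ordering breaks.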