I have two tables, one is an index (or map) which helps when other when pulling queries.
SELECT v.*
FROM smv_ v
WHERE (SELECT p.network
FROM providers p
WHERE p.provider_id = v.provider_id) = 'RUU='
AND (SELECT p.va
FROM providers p
WHERE p.provider_id = v.provider_id) = 'MjU='
LIMIT 1;
Because we do not know the name of the column that holds the main data, we need to look it up, using the provider_id which is in both tables, and then query.
I am not getting any errors, but also no data back. I have spent the past hour trying to put this on sqlfiddle, but it kept crashing, so I just wanted to check if my code is really wrong, hence the crashing?
In the above example, I am looking in the providers table for column network, where the provider_id matches, and then use that as the column on smv.
I am sure i have done this before just like this, but after the weekend trying I thought i would ask on here.
Thanks in Advance.
UPDATE
Here is an example of the data:
THis is the providers, this links so no matter what the name of the column on the smv table, we can link them.
+---+---+---------------+---------+-------+--------+-----+-------+--------+
| | A | B | C | D | E | F | G | H |
+---+---+---------------+---------+-------+--------+-----+-------+--------+
| 1 | 1 | Home | network | batch | bs | bp | va | bex |
| 2 | 2 | Recharge | code | id | serial | pin | value | expire |
+---+---+---------------+---------+-------+--------+-----+-------+--------+
In the example above, G will mean in the smv column for recharge we be value. So that is what we would look for in our WHERE clause.
Here is the smv table:
+---+---+-----------+-----------+---+----+---------------------+-----+--+
| | A | B | C | D | E | F | value | va |
+---+---+-----------+-----------+---+----+---------------------+-----+--+
| 1 | 1 | X2 | Home | 4 | 10 | 2016-09-26 15:20:58 | | 7 |
| 2 | 2 | X2 | Recharge | 4 | 11 | 2016-09-26 15:20:58 | 9 | |
+---+---+-----------+-----------+---+----+---------------------+-----+--+
value in the same example as above would be 9, or 'RUU=' decoded.
So we do not know the name of the rows, until the row from smv is called, once we have this, we can look up what column name we need to get the correct information.
Hope this helps.
MORE INFO
At the point of triggering, we do not know what the row consists of the right data because some many of the fields would be empty. The map is there to help we query the right column, to get the right row (smv grows over time depending on whats uploaded.)
1) SELECT p.va FROM providers p WHERE p.network = 'Recharge' ;
2) SELECT s.* FROM smv s, providers p WHERE p.network = 'Recharge';
1) gives me the correct column I need to look up and query smv, using the above examples it would come back with "value". So I need to now look up, within the smv table, C = Recharge, and value = '9'. This should bring me back row 2 of the smv table.
So individually both 1 and 2 queries work, but I need them put together so the query is done on the database server.
Hope this gives more insight
Even More Info
From reading other posts, which are not really doing what I need, i have come up with this:
SELECT s.*
FROM (SELECT
(SELECT p.va
FROM dh_smv_providers p
WHERE p.provider_name = 'vodaphone'
LIMIT 1) AS net,
(SELECT p.bex
FROM dh_smv_providers p
WHERE p.provider_name = 'vodaphone'
LIMIT 1) AS bex
FROM dh_smv_providers) AS val, dh_smv_ s
WHERE s.provider_id = 'vodaphone' AND net = '20'
ORDER BY from_base64(val.bex) DESC;
The above comes back blank, but if i replace net, in the WHERE clause with a column I know exists, I do get the results expected:
SELECT s.*
FROM (SELECT
(SELECT p.va
FROM dh_smv_providers p
WHERE p.provider_name = 'vodaphone'
LIMIT 1) AS net,
(SELECT p.bex
FROM dh_smv_providers p
WHERE p.provider_name = 'vodaphone'
LIMIT 1) AS bex
FROM dh_smv_providers) AS val, dh_smv_ s
WHERE s.provider_id = 'vodaphone' AND value = '20'
ORDER BY from_base64(val.bex) DESC;
So what I am doing wrong, which is net, not showing the value derived from the subquery "value" ?
Thanks
SELECT
v.*,
p.network, p.va
FROM
smv_ v
INNER JOIN
providers p ON p.provider_id = v.provider_id
WHERE
p.network = 'RUU=' AND p.va = 'MjU='
LIMIT 1;
The tables talk to each other via the JOIN syntax. This completely circumvents the need (and limitations) of sub-selects.
The INNER JOIN means that only fully successful matches are returned, you may need to adjust this type of join for your situation but the SQL will return a row of all v columns where p.va = MjU and p.network = RUU and p.provider_id = v.provider_id.
What I was trying to explain in comments is that subqueries do not have any knowledge of their outer query:
SELECT *
FROM a
WHERE (SELECT * FROM b WHERE a)
AND (SELECT * FROM c WHERE a OR b)
This layout (as you have in your question) is that b knows nothing about a because the b query is executed first, then the c query, then finally the a query. So your original query is looking for WHERE p.provider_id = v.provider_id but v has not yet been defined so the result is false.
Related
This question is a bit complicated to me, and I can't explain it in one sentence so the title may seem quite ambiguous.
I have 3 tables in my MySQL database, their structure is shown below:
word_list (5 million rows)
+-----+--------+
| wid | word |
+-----+--------+
| 1 | foo |
| 2 | bar |
| 3 | hello |
+-----+--------+
paper_word_relation (10 million rows)
+-----+-------+
| pid | word |
+-----+-------+
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 1 |
| 2 | 3 |
+-----+-------+
paper_citation_relation (80K rows)
+----------+--------+
| pid_from | pid_to |
+----------+--------+
| 1 | 2 |
| 1 | 3 |
| 1 | 4 |
| 2 | 1 |
| 2 | 3 |
+----------+--------+
I want to find out how many papers contain word W, and cite the papers also contain word W.(for each word in the list)
I use two inner join to do this job but it seems extremely slow when the word is popular - above 50s (quite fast if the word is rarely used - below 0.1s), here is my code
SELECT COUNT(*) FROM (
SELECT a.pid_from, a.pid_to, b.word FROM paper_citation_relation AS a
INNER JOIN paper_word_relation AS b ON a.pid_from = b.pid
INNER JOIN paper_word_relation AS c ON a.pid_to = c.pid
WHERE b.word = 2 AND c.word = 2) AS d
How can I do this faster? Is my query not efficient enough or it's the problem about the amount of data?
I can only come up with one solution that I delete the words which occur less than 2 in the paper_word_relation table. (About 4 million words only occur once)
Thanks!
If you are only concerned with getting the Count, you should not be first getting the results into a Derived Table, and then Count the rows out. This may create unnecessary temporary tables storing lots of data in-memory. You can directly count the number of rows.
I also think that you need to count unique number of papers. Because of Many-to-Many relationships in paper_citation_relation table, duplicate rows may be coming for a single paper.
SELECT COUNT(DISTINCT a.pid_from)
FROM paper_citation_relation AS a
INNER JOIN paper_word_relation AS b ON a.pid_from = b.pid
INNER JOIN paper_word_relation AS c ON a.pid_to = c.pid
WHERE b.word = 2 AND c.word = 2
For performance, you will need following indexing:
Composite Index on (pid_from, pid_to) in the paper_citation_relation table.
Composite Index on (pid, word) in the paper_word_relation table.
We may also possibly optimize the query further by reducing one join and use conditional AND/OR based filtering in HAVING. You will need to benchmark it though.
SELECT COUNT(*)
FROM (
SELECT a.pid_from
FROM paper_citation_relation AS a
INNER JOIN paper_word_relation AS b
ON (a.pid_from = b.pid OR
a.pid_to = b.pid)
GROUP BY a.pid_from
HAVING SUM(a.pid_from = b.pid AND b.word = 2) AND
SUM(a.pid_to = b.pid AND b.word = 2)
)
After the first 1:n join you get the same pid_to multiple times and your next join is no longer 1:n but n:m, creating a possibly huge intermediate result before the final DISTINCT. It's similar to a CROSS JOIN and it's getting worse for popular words, e.g. 10*10 vs. 1000*1000 rows.
You must remove the duplicates before the join, this should return the same number as #MadhurBhaiya's answer
SELECT Count(*) -- no more DISTINCT needed
FROM
(
SELECT DISTINCT cr.pid_to -- reducing m to 1
FROM paper_citation_relation AS cr
JOIN paper_word_relation AS wr
ON cr.pid_from = wr.pid
WHERE wr.word = 2
) AS dt
JOIN paper_word_relation AS wr
ON dt.pid_to = wr.pid -- 1:n join again
WHERE wr.word = 2
If you want to count the number of papers which have been cited you need to get a distinct list of pid (either pid_from or pid_to) from paper_citation_relation first and then join to the specific word.
SELECT Count(*)
FROM
( -- get a unique list of cited or citing papers
SELECT pid_from AS pid -- citing
FROM paper_citation_relation
UNION -- DISTINCT by default
SELECT pid_to -- cited
FROM paper_citation_relation
) AS dt
JOIN paper_word_relation AS wr
ON wr.pid = dt.pid
WHERE wr.word = 2 -- now check for the searched word
The number returned by this might be slightly higher (it counts a paper regardless if cited or citing).
I'm not sure I titled the question correctly, I'll try to explain what I mean by example.
I have the following query:
SELECT
d.name_ascii,
d.price,
t.whois_price
FROM domain d
JOIN tld t ON d.tld_id = t.id
JOIN user_tld ut ON t.id = ut.tld_id
WHERE ut.user_id = 1
AND d.name_ascii IN ('domain.com')
which returns the following:
+------------+---------+-------------+
| name_ascii | price | whois_price |
+------------+---------+-------------+
| domain.com | 321.000 | 15.000 |
+------------+---------+-------------+
It's ok, but I need next result:
+------------------+---------+
| item | price |
+------------------+---------+
| domain.com | 321.000 |
| domain.com whois | 15.000 |
+------------------+---------+
That is I need to remove 3-rd column named whois_price and insert it as another row with the same price value and concatenated name_ascii value with "whois" as item column.
I have no idea how to solve it. Any advice, hint?
Thanks!
One way to accomplish this would be with a UNION.
SELECT
d.name_ascii as item,
d.price
FROM domain d
WHERE d.name_ascii IN ('domain.com')
UNION ALL
SELECT
concat(d.name_ascii,' whois'),
t.whois_price
FROM domain d
JOIN tld t ON d.tld_id = t.id
JOIN user_tld ut ON t.id = ut.tld_id
WHERE ut.user_id = 1
AND d.name_ascii IN ('domain.com')
The first part gets price, the second part gets the whois price.
Here's a SQL Fiddle: http://sqlfiddle.com/#!9/d5164/3
I need to implement a function which returns all the networks the installation is not part of.
Following is my table and for example if my installation id is 1 and I need all the network ids where the installation is not part of then the result will be only [9].
network_id | installation_id
-------------------------------
1 | 1
3 | 1
2 | 1
2 | 2
9 | 2
2 | 3
I know this could be solved with a join query but I'm not sure how to implement it for the same table. This is what I've tried so far.
select * from network_installations where installation_id = 1;
network_id | installation_id
-------------------------------
1 | 1
2 | 1
3 | 1
select * from network_installations where installation_id != 1;
network_id | installation_id
-------------------------------
9 | 2
2 | 2
2 | 3
The intersection of the two tables will result the expected answer, i.e. [9]. But though we have union, intersect is not present in mysql. A solution to find the intersection of the above two queries or a tip to implement it with a single query using join will be much appreciated.
The best way to do this is to use a network table (which I presume exists):
select n.*
from network n
where not exists (select 1
from network_installation ni
where ni.network_id = n.network_id and
ni.installation_id = 1
);
If, somehow, you don't have a network table, you can replace the from clause with:
from (select distinct network_id from network_installation) n
EDIT:
You can do this in a single query with no subqueries, but a join is superfluous. Just use group by:
select ni.network_id
from network_installation ni
group by ni.network_id
having sum(ni.installation_id = 1) = 0;
The having clause counts the number of matches for the given installation for each network id. The = 0 is saying that there are none.
Another solution using OUTER JOIN:
SELECT t1.network_id, t1.installation_id, t2.network_id, t2.installation_id
FROM tab t1 LEFT JOIN tab t2
ON t1.network_id = t2.network_id AND t2.installation_id = 1
WHERE t2.network_id IS NULL
You can check at http://www.sqlfiddle.com/#!9/4798d/2
select *
from network_installations
where network_id in
(select network_id
from network_installations
where installation_id = 1
group by network_id )
I have a database with table xxx_facileforms_forms, xxx_facileforms_records and xxx_facileforms_subrecords.
Column headers for xxx_facileforms_subrecords:
id | record | element | title | neame | type | value
As far as filtering records with element = '101' ..query returns proper records, but when i add subquery to filete aditional element = '4871' from same table - 0 records returned.
SELECT
F.id AS form_id,
R.id AS record_id,
PV.value AS prim_val,
COUNT(PV.value) AS count
FROM
xxx_facileforms_forms AS F
INNER JOIN xxx_facileforms_records AS R ON F.id = R.form
INNER JOIN xxx_facileforms_subrecords AS PV ON R.id = PV.record AND PV.element = '101'
WHERE R.id IN (SELECT record FROM xxx_facileforms_records WHERE record = R.id AND element = '4871')
GROUP BY PV.value
Does this looks right?
Thank You!
EDIT
Thank you for support and ideas! Yes, I left lot of un guessing. Sorry. Some input/output table data might help make it more clear.
_facileforms_form:
id | formname
---+---------
1 | myform
_facileforms_records:
id | form | submitted
----+------+--------------------
163 | 1 | 2014-06-12 14:18:00
164 | 1 | 2014-06-12 14:19:00
165 | 1 | 2014-06-12 14:20:00
_facileforms_subrecords:
id | record | element | title | name|type | value
-----+--------+---------+--------+-------------+--------
5821 | 163 | 101 | ticket | radio group | flight
5822 | 163 | 4871 | status | select list | canceled
5823 | 164 | 101 | ticket | radio group | flight
5824 | 165 | 101 | ticket | radio group | flight
5825 | 165 | 4871 | status | select list | canceled
Successful query result:
form_id | record_id | prim_val | count
1 | 163 | flight | 2
So i have to return value data (& sum those records) from those records where _subrecord element - 4871 is present (in this case 163 and 165).
And again Thank You!
Thank You for support and ideas! Yes i left lot of un guessing.. sorry . So may be some input/output table data might help.
_facileforms_form:
headers -> id | formname
1 | myform
_facileforms_records:
headers -> id | form | submitted
163 | 1 | 2014-06-12 14:18:00
164 | 1 | 2014-06-12 14:19:00
165 | 1 | 2014-06-12 14:20:00
_facileforms_subrecords
headers -> id | record | element | title | name | type | value
5821 | 163 | 101 | ticket | radio group| flight
5822 | 163 | 4871 | status | select list | canceled
5823 | 164 | 101 | ticket | radio group | flight
5824 | 165 | 101 | ticket | radio group | flight
5825 | 165 | 4871 | status | select list | canceled
Succesful Query result:
headers -> form_id | record_id | prim_val | count
1 | 163 | flight | 2
So i have to return value data (& sum those records) from those records where _subrecord element - 4871 is present (in this case 163 and 165).
And again Thank You!
No, it doesn't look quite right. There's a predicate "R.id IN (subquery)" but that subquery itself has a reference to R.id; it's a correlated subquery. Looks like something is doubled up there. (We're assuming here that id is a UNIQUE or PRIMARY key in each table.)
The subquery references an identifier element... the only other reference we see to that identifier is from the _subrecords table (we don't see any reference to that column in _records table... if there's no element column in _records, then that's a reference to the element column in PV, and that predicate in the subquery will never be true at the same time the PV.element='101' predicate is true.
Kudos for qualifying the column references with a table alias, that makes the query (and the EXPLAIN output) much easier to read; the reader doesn't need to go digging around in the table definitions to figure out which table does and doesn't contain which columns. But please take that pattern to the next step, and qualify all column references in the query, including column references in the subqueries.
Since the reference to element isn't qualified, we're left to guess whether the _records table contains a column named element.
If the goal is to return only the rows from R with element='4871', we could just do...
WHERE R.element='4871'
But, given that you've gone to the bother of using a subquery, I suspect that's not really what you want.
It's possible you're trying to return all rows from R for a _form, but only for the _form where there's at least one associated _record with element='4871'. We could get that result returned with either an IN (subquery) or an EXISTS (correlated_ subquery) predicate, or an anti-join pattern. I'd give examples of those query patterns; I could take some guesses at the specification, but I would only be guessing at what you actually want to return.
But I'm guessing that's not really what you want. I suspect that _records doesn't actually contain a column named element.
The query is already restricting the rows returned from PV with those that have element='101'.)
This is a case where some example data and the example output would help explain the actual specification; and that would be a basis for developing the required SQL.
FOLLOWUP
I'm just guessing... maybe what you want is something pretty simple. Maybe you want to return rows that have element value of either '101' or '4913'.
The IN comparison operator is a convenient of way of expressing the OR condition, that a column be equal to a value in a list:
SELECT F.id AS form_id
, R.id AS record_id
, PV.value AS prim_val
, COUNT(PV.value) AS count
FROM xxx_facileforms_forms F
JOIN xxx_facileforms_records R
ON R.form = F.id
JOIN xxx_facileforms_subrecords PV
ON PV.record = R.id
AND PV.element IN ('101','4193')
GROUP BY PV.value
NOTE: This query (like the OP query) is using a non-standard MySQL extension to GROUP BY, which allows non-aggregate expressions (e.g. bare columns) to be returned in the SELECT list.
The values returned for the non-aggregate expressions (in this case, F.id and R.id) will be a values from a row included in the "group". But because there can be multiple rows, and different values on those rows, it's not deterministic which of values will be returned. (Other databases would reject this statement, unless we wrapped those columns in an aggregate function, such as MIN() or MAX().)
FOLLOWUP
I noticed that you added information about the question into an answer... this information would better be added to the question as an EDIT, since it's not an answer to the question. I took the liberty of copying that, and reformatting.
The example makes it much more clear what you are trying to accomplish.
I think the easiest to understand is to use EXISTS predicate, to check whether a row meeting some criteria "exists" or not, and exclude rows where such a row does not exist. This will use a correlated subquery of the _subrecords table, to which check for the existence of a matching row:
SELECT f.id AS form_id
, r.id AS record_id
, pv.value AS prim_val
, COUNT(pv.value) AS count
FROM xxx_facileforms_forms f
JOIN xxx_facileforms_records r
ON r.form = f.id
JOIN xxx_facileforms_subrecords pv
ON pv.record = r.id
AND pv.element = '101'
-- only include rows where there's also a related 4193 subrecord
WHERE EXISTS ( SELECT 1
FROM xxx_facileforms_subrecords sx
WHERE sx.element = '4193'
AND sx.record = r.id
)
--
GROUP BY pv.value
(I'm thinking this is where OP was headed with the idea that a subquery was required.)
Given that there's a GROUP BY in the query, we could actually accomplish an equivalent result with a regular join operation, to a second reference to the _subrecords table.
A join operation is often more efficient than using an EXISTS predicate.
(Note that the existing GROUP BY clause will eliminate any "duplicates" that might otherwise be introduced by a JOIN operation, so this will return an equivalent result.)
SELECT f.id AS form_id
, r.id AS record_id
, pv.value AS prim_val
, COUNT(pv.value) AS count
FROM xxx_facileforms_forms f
JOIN xxx_facileforms_records r
ON r.form = f.id
JOIN xxx_facileforms_subrecords pv
ON pv.record = r.id
AND pv.element = '101'
-- only include rows where there's also a related 4193 subrecord
JOIN xxx_facileforms_subrecords sx
ON sx.record = r.id
AND sx.element = '4193'
--
GROUP BY pv.value
I have this query
SELECT
shot.hole AS hole,
shot.id AS id,
(SELECT s.id FROM shot AS s
WHERE s.hole = shot.hole AND s.shot_number > shot.shot_number AND shot.round_id = s.round_id
ORDER BY s.shot_number ASC LIMIT 1) AS next_shot_id,
shot.distance AS distance_remaining,
shot.type AS hit_type,
shot.area AS onto
FROM shot
JOIN course ON shot.course_id = course.id
JOIN round ON shot.round_id = round.id
WHERE round.uID = 78
This returns 900~ rows in around 0.7 seconds. This is OK-ish, but there are more lines like this required
(SELECT s.id FROM shot AS s
WHERE s.hole = shot.hole AND s.shot_number > shot.shot_number AND shot.round_id = s.round_id
ORDER BY s.shot_number ASC LIMIT 1) AS next_shot_id,
For example
(SELECT s.id FROM shot AS s
WHERE s.hole = shot.hole AND s.shot_number < shot.shot_number AND shot.round_id = s.round_id
ORDER BY s.shot_number ASC LIMIT 1) AS past_shot_id,
Adding this increases the load time to 10s of seconds which is far too long and the page often doesn't load at all or MySQL just locks up and using show processlist shows that the query is just sat there sending data.
Removing the ORDER BY s.shot_number ASC clause in those sub queries reduces the query time down to 0.05 seconds which is much much better. But the ORDER BY is required to ensure that the next or past row (shot) is returned, rather than any old random row.
How can I improve this query to make it run faster and return the same results. Perhaps my approach for obtaining the next and past rows is sub optimal and I need to look at a different way of returning those next and previous row IDs?
EDIT - additional background info
The query was fine on my testing domain, a subdomain. But when moved to the live domain the issues started. Hardly anything was changed yet the whole site came to halt because of these new slow queries. Key notes:
Different domain
Different folder in /var/www
Same DB
Same DB credentials
Same code
Added indexes in an attempt to fix - this didn't help
Could any of these affected the load time?
This will get marked down in a minute for 'not being an answer', but it illustrates a possible solution without simply handing it to you on a plate....
SELECT * FROM ints;
+---+
| i |
+---+
| 0 |
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
+---+
SELECT x.i, MIN(y.i) FROM ints x LEFT JOIN ints y ON y.i > x.i GROUP BY x.i;
+---+----------+
| i | MIN(y.i) |
+---+----------+
| 0 | 1 |
| 1 | 2 |
| 2 | 3 |
| 3 | 4 |
| 4 | 5 |
| 5 | 6 |
| 6 | 7 |
| 7 | 8 |
| 8 | 9 |
| 9 | NULL |
+---+----------+
To expand on Strawberry's answer, doing additional left-join for a "pre-query" to get all the prior / next IDs, then join out to get whatever details you need.
select
Shot.ID,
Shot.Hole,
Shot.Distance as Distance_Remaining,
Shot.Type as Hit_Type,
Shot.Area as Onto
PriorShot.Hole as PriorHole,
PriorShot.Distance as PriorDistanceRemain,
NextShot.Hole as NextHole,
NextShot.Distance as NextDistanceRemain
from
( SELECT
shot.id,
MIN(nextshot.id) as NextShotID,
MAX(priorshot.id) as PriorShotID
FROM
round
JOIN shot
on round.id = shot.round_id
LEFT JOIN shot nextshot
ON shot.round_id = nextshot.round_id
AND shot.hole = nextshot.hole
AND shot.shot_number < nextshot.shot_number
LEFT JOIN shot priorshot
ON shot.round_id = priorshot.round_id
AND shot.hole = priorshot.hole
AND shot.shot_number > priorshot.shot_number
WHERE
round.uID = 78
GROUP BY
shot.id ) AllShots
JOIN Shot
on AllShots.id = Shot.ID
LEFT JOIN shot PriorShot
on AllShots.PriorShotID = PriorShot.ID
LEFT JOIN shot NextShot
on AllShots.NextShotID = NextShot.ID
The inner query gets only those for round.uID = 78, then you can join to the next / prior as needed. I did not add the joins to the course and round tables as no result columns were presented, but could easily be added.
I wonder how well the following performs. It replaces the joining operations with string operations.
SELECT shot.hole AS hole, shot.id AS id,
substring_index(substring_index(shots, ',', find_in_set(shot.id, ss.shots) + 1), ',', -1
) as nextsi,
substring_index(substring_index(shots, ',', find_in_set(shot.id, ss.shots) - 1), ',', -1
) as prevsi,
shot.distance AS distance_remaining, shot.type AS hit_type, shot.area AS onto
FROM shot JOIN
course
ON shot.course_id = course.id JOIN
round
ON shot.round_id = round.id join
(select s.round_id, s.hole, group_concat(s.id order by s.shot_number) as shots
from shot s
group by s.round_id, s.hole
) ss
on ss.round_id = shot.round_id and ss.hole = shot.hole
WHERE round.uID = 78
Note that this doesn't work fully -- it will produce erroneous results on the first and last shot. I'm wondering how the performance is before fixing those details.