I'm having a small issue with a query using ANY.
Select *, count(*) as m
from mp_bigrams_raw
where date_parsed=051213
and art_source='f'
and bigram != ANY(select feed_source from mp_feed_sources)
group by bigram
order by m DESC
limit 50;
The query runs but it's not excluding the items found in the subquery.
The original query worked when there was only 1 row in the subquery. Once I added more I got an error about more than 1 row.
Select *, count(*) as m
from mp_bigrams_raw
where date_parsed=051213
and art_source='f'
and bigram != (select feed_source from mp_feed_sources)
group by bigram
order by m DESC
limit 50;
From there I added ANY and the query runs but seems to ignore the !=. I'm guessing I'm missing something here.
Thanks
Why don't you use NOT IN?
Select *, count(*) as m
from mp_bigrams_raw
where date_parsed=051213
and art_source='f'
and bigram NOT IN(select feed_source from mp_feed_sources)
group by bigram
order by m DESC
limit 50;
Try using a LEFT JOIN with an IS NULL check:
Select r.*, count(*) as m
from mp_bigrams_raw r
left join mp_feed_sources f on f.feed_source = r.bigram
where r.date_parsed=051213
and r.art_source='f'
and f.feed_source is null
group by r.bigram
order by m DESC
limit 50;
According to the documentation, a comparison with ANY is true whenever it holds for at least one row returned by the subquery. So bigram != ANY (...) is true as soon as a single feed_source differs from bigram, which is almost always the case once the subquery returns more than one distinct value.
NOT IN is what you want: it is true only when bigram matches none of the selected values (it is equivalent to != ALL).
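To make the difference concrete, here is a minimal sketch in Python using an in-memory SQLite database. The sample rows are made up for illustration, and because SQLite has no ANY operator, the != ANY and != ALL semantics are emulated with Python's any() and all(); only the NOT IN query runs as real SQL.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical sample rows; only the table and column names come from the question.
con.executescript("""
    CREATE TABLE mp_bigrams_raw (bigram TEXT);
    CREATE TABLE mp_feed_sources (feed_source TEXT);
    INSERT INTO mp_bigrams_raw VALUES ('new york'), ('daily mail'), ('bbc news');
    INSERT INTO mp_feed_sources VALUES ('daily mail'), ('bbc news');
""")

bigrams = [row[0] for row in con.execute(
    "SELECT bigram FROM mp_bigrams_raw ORDER BY bigram")]
sources = [row[0] for row in con.execute(
    "SELECT feed_source FROM mp_feed_sources")]

# bigram != ANY (subquery): true as soon as ONE source differs from the bigram.
kept_by_any = [b for b in bigrams if any(b != s for s in sources)]

# bigram != ALL (subquery), i.e. NOT IN: true only if EVERY source differs.
kept_by_all = [b for b in bigrams if all(b != s for s in sources)]

# The same NOT IN filter executed as SQL.
kept_by_not_in = [row[0] for row in con.execute(
    "SELECT bigram FROM mp_bigrams_raw "
    "WHERE bigram NOT IN (SELECT feed_source FROM mp_feed_sources) "
    "ORDER BY bigram")]

print(kept_by_any)     # nothing is excluded
print(kept_by_not_in)  # only bigrams absent from the feed sources remain
```

With two or more distinct feed sources, every bigram differs from at least one of them, which is why the != ANY version appears to ignore the filter.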
Related
I want funnel_count to count only non-zero funnel_id values. I do get a funnel_id count, but it also counts rows where funnel_id is zero. I can't filter them out with a WHERE clause because the same query also has to return page_count.
SELECT `smart_projects`.project_id, `smart_projects`.business_id, `smart_projects`.title,
`page_pages`.`funnel_id` as `funnel_id`, count(distinct(page_pages.page_id) )as page_count, count(distinct(page_pages.funnel_id) )as funnel_count
FROM `smart_projects`
LEFT JOIN `page_pages` ON `smart_projects`.`project_id` = `page_pages`.`project_id`
WHERE smart_projects.status != 0
AND `smart_projects`.`business_id` = 'cd9412774edb11e9'
AND `smart_projects`.`created_date` BETWEEN 1558031400 AND 1558722600
GROUP BY `smart_projects`.`project_id`
ORDER BY `funnel_count` ASC
LIMIT 10
page_pages table is :
smart_projects table is :
result is :-
Expected Result is :
SELECT `smart_projects`.project_id, `smart_projects`.business_id, `smart_projects`.title,
`page_pages`.`funnel_id` as `funnel_id`, count(distinct(page_pages.page_id) )as page_count, count(distinct (CASE WHEN page_pages.funnel_id != 0 then page_pages.funnel_id ELSE NULL END ) ) as funnel_count
FROM `smart_projects`
LEFT JOIN `page_pages` ON `smart_projects`.`project_id` = `page_pages`.`project_id`
WHERE smart_projects.status != 0
AND `smart_projects`.`business_id` = 'cd9412774edb11e9'
GROUP BY `smart_projects`.`project_id`
ORDER BY `title` DESC
If you want to filter out zero values, use a HAVING clause:
SELECT sp.project_id, sp.business_id, sp.title,
count(distinct pp.page_id ) as page_count,
count(distinct pp.funnel_id ) as funnel_count
FROM `smart_projects` sp LEFT JOIN
`page_pages` pp
ON sp.`project_id` = pp.`project_id`
WHERE sp.status <> 0 AND
sp.`business_id` = 'cd9412774edb11e9' AND
sp.`created_date` BETWEEN 1558031400 AND 1558722600
GROUP BY sp.`project_id`
HAVING funnel_count > 0
ORDER BY `funnel_count` ASC
LIMIT 10;
Notes:
Table aliases make the query easier to write and to read.
DISTINCT is not a function, it is a keyword. The following expression does not need parentheses.
funnel_id is not appropriate in the SELECT, because it is the argument to an aggregation function.
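As a rough illustration of how the two ideas in this thread fit together, the sketch below runs against an in-memory SQLite database with made-up rows (only the table and column names come from the question). It uses a CASE expression inside COUNT(DISTINCT ...) so that funnel_id = 0 is not counted, and optionally a HAVING clause to drop projects whose funnel count is zero. The HAVING condition repeats the aggregate instead of relying on the alias, purely to stay portable.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical sample data: project 1 has one real funnel (5) plus a page
# with funnel_id 0; project 2 only has a page with funnel_id 0.
con.executescript("""
    CREATE TABLE smart_projects (project_id INTEGER, title TEXT);
    CREATE TABLE page_pages (page_id INTEGER, project_id INTEGER, funnel_id INTEGER);
    INSERT INTO smart_projects VALUES (1, 'with funnels'), (2, 'pages only');
    INSERT INTO page_pages VALUES (10, 1, 5), (11, 1, 5), (12, 1, 0), (13, 2, 0);
""")

query = """
    SELECT sp.project_id,
           COUNT(DISTINCT pp.page_id) AS page_count,
           COUNT(DISTINCT CASE WHEN pp.funnel_id <> 0 THEN pp.funnel_id END) AS funnel_count
    FROM smart_projects sp
    LEFT JOIN page_pages pp ON sp.project_id = pp.project_id
    GROUP BY sp.project_id
    {having}
    ORDER BY sp.project_id
"""

# Every project, with zero funnel ids excluded from the count.
all_projects = con.execute(query.format(having="")).fetchall()

# Only projects that have at least one non-zero funnel id.
nonzero_only = con.execute(query.format(
    having="HAVING COUNT(DISTINCT CASE WHEN pp.funnel_id <> 0 THEN pp.funnel_id END) > 0"
)).fetchall()

print(all_projects)
print(nonzero_only)
```

The CASE expression yields NULL for funnel_id = 0, and COUNT ignores NULLs, so page_count is unaffected while funnel_count only reflects real funnels.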
With the code below I get all keywords with their positions via a LEFT JOIN. It works, but it shows the first recorded position of each keyword, whereas I want the last recorded position (by date).
SELECT keyword.id, keyword.title, keyword.date, rank.position FROM keyword
LEFT JOIN rank
ON rank.wordid = keyword.id
GROUP BY keyword.id
ORDER BY keyword.date DESC
How can I do this? Should I use subquery or what? Is there any way to do this without a subquery?
SAMPLE DATA
What I want:
I want to get 17 instead of 13, that is, the most recently recorded position.
Do not use group by for this! You want to filter, so use a where clause. In this case, using a correlated subquery works well:
SELECT k.id, k.title, k.date, r.position
FROM keyword k LEFT JOIN
rank r
ON r.wordid = k.id AND
r.date = (SELECT MAX(r2.date)
FROM rank r2
WHERE r2.wordid = k.id
)
ORDER BY k.date DESC
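A small sketch of this correlated-subquery pattern, run against an in-memory SQLite database from Python. The positions 13 and 17 come from the question; the titles, dates, and the extra keywords are invented for the example, and the rank table name is quoted because RANK is a reserved word in some databases.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical rows: keyword 1 was ranked 13 first and 17 later,
# keyword 3 has no rank rows at all.
con.executescript("""
    CREATE TABLE keyword (id INTEGER, title TEXT, date TEXT);
    CREATE TABLE "rank" (id INTEGER, wordid INTEGER, position INTEGER, date TEXT);
    INSERT INTO keyword VALUES
        (1, 'alpha', '2019-05-01'),
        (2, 'beta',  '2019-05-02'),
        (3, 'gamma', '2019-05-03');
    INSERT INTO "rank" VALUES
        (1, 1, 13, '2019-05-10'),
        (2, 1, 17, '2019-05-12'),
        (3, 2, 4,  '2019-05-11');
""")

# Join each keyword only to the rank row carrying its latest date.
rows = con.execute("""
    SELECT k.id, k.title, r.position
    FROM keyword k
    LEFT JOIN "rank" r
      ON r.wordid = k.id
     AND r.date = (SELECT MAX(r2.date) FROM "rank" r2 WHERE r2.wordid = k.id)
    ORDER BY k.id
""").fetchall()

print(rows)
```

Because the date condition lives in the ON clause rather than in WHERE, keywords without any rank rows are still returned, with a NULL position.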
You can use the query below, assuming rank.id increases with every newly recorded position:
SELECT keyword.id, keyword.title, keyword.date, rankNew.position FROM keyword LEFT JOIN rank AS rankNew
ON rankNew.id = (SELECT MAX(r2.id) FROM rank r2 WHERE r2.wordid = keyword.id);
For more background, see "Retrieving the last record in each group - MySQL".
Let's say I have a list of URLs and I want to find the one that is the most unique, meaning the one that appears the fewest times. Here is an example of the database:
3598 ('www.emp.de/blog/tag/fear-factory/',)
3599 ('www.emp.de/blog/tag/white-russian/',)
3600 ('www.emp.de/blog/musik/die-emp-plattenkiste-zum-07-august-2015/',)
3601 ('www.emp.de/Warenkorb/car_/',)
3602 ('www.emp.de/ter_dataprotection/',)
3603 ('hilfe.monster.de/my20/faq.aspx#help_1_211589',)
3604 ('jobs.monster.de/l-nordrhein-westfalen.aspx',)
3605 ('karriere-beratung.monster.de',)
3606 ('karriere-beratung.monster.de',)
In this case it should return jobs.monster.de or hilfe.monster.de. I only want one return value. Is that possible with pure mysql?
It should be some kind of counting of the main url before the ".de"
At this moment I do it this way:
con.execute("select url, date from urls_to_visit ORDER BY RANDOM() LIMIT 1")
You could join the table to itself on rows whose IDs differ and count the matches, then sort in ascending order and limit to 1 result.
(Not checked.)
SELECT COUNT(*) as hitcount,
SUBSTRING_INDEX(t1.`url`,'.',2) as url
FROM table t1
INNER JOIN table t2 ON
SUBSTRING_INDEX(t1.`url`,'.',2) = SUBSTRING_INDEX(t2.`url`,'.',2)
AND t1.id <> t2.id
GROUP BY SUBSTRING_INDEX(t1.`url`,'.',2)
ORDER BY hitcount ASC
LIMIT 1
EDIT
Just checked on this, and it doesn't quite work.
I came up with this alternative, which uses a subquery to group all the domains together and get a count.
SELECT subq.count as hitcount,SUBSTRING_INDEX(t1.`url`,'.',2) as domain
FROM hits t1
INNER JOIN
(SELECT COUNT(*) as count,
SUBSTRING_INDEX(`url`,'.',2) as domain
FROM hits GROUP BY SUBSTRING_INDEX(`url`,'.',2)
) subq
ON subq.domain = SUBSTRING_INDEX(t1.`url`,'.',2)
GROUP BY SUBSTRING_INDEX(t1.`url`,'.',2)
ORDER BY hitcount ASC
LIMIT 1
working fiddle
Given your sample data (ignoring the parentheses, because I have no idea what those are doing), this query should do what you want:
select substring_index(url, '.', 2) as domain, count(*) as cnt
from urls_to_visit t
group by substring_index(url, '.', 2)
order by cnt asc
limit 1;
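Since the snippet in the question uses con.execute and ORDER BY RANDOM(), which suggests Python with SQLite, here is a hedged sketch of the same group-and-count idea for that setup. It groups by the host part (everything before the first slash) rather than by SUBSTRING_INDEX, which SQLite does not have. The id column and the tie-break on host are assumptions added for a deterministic result.

```python
import sqlite3

# The sample URLs from the question.
urls = [
    "www.emp.de/blog/tag/fear-factory/",
    "www.emp.de/blog/tag/white-russian/",
    "www.emp.de/blog/musik/die-emp-plattenkiste-zum-07-august-2015/",
    "www.emp.de/Warenkorb/car_/",
    "www.emp.de/ter_dataprotection/",
    "hilfe.monster.de/my20/faq.aspx#help_1_211589",
    "jobs.monster.de/l-nordrhein-westfalen.aspx",
    "karriere-beratung.monster.de",
    "karriere-beratung.monster.de",
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE urls_to_visit (id INTEGER PRIMARY KEY, url TEXT)")
con.executemany("INSERT INTO urls_to_visit (url) VALUES (?)", [(u,) for u in urls])

# Appending '/' guarantees instr() finds a slash even for bare host names.
host, cnt = con.execute("""
    SELECT substr(url, 1, instr(url || '/', '/') - 1) AS host,
           COUNT(*) AS cnt
    FROM urls_to_visit
    GROUP BY host
    ORDER BY cnt ASC, host ASC   -- fewest first; host breaks ties
    LIMIT 1
""").fetchone()

print(host, cnt)
```

Both hilfe.monster.de and jobs.monster.de appear once in the sample, so either would be a valid answer; the secondary sort just makes the choice repeatable.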
Is it possible to remove the subquery from this SQL? I need to order by the "match against" score, but obviously can't order by the alias.
SELECT *
FROM
(SELECT b.shortDesc,
b.img,
sm.uri,
match(`bodyCopy`, `shortDesc`) against ('Storage' IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION) AS score
FROM `blog` b
JOIN `sitemap` sm ON sm.id = b.pageId
WHERE 'Active' IN (b.status, sm.status)
) t1
WHERE score > 0
ORDER BY score DESC
You can get rid of the subquery:
SELECT b.shortDesc,
b.img,
sm.uri,
match(`bodyCopy`, `shortDesc`) against ('Storage' IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION) AS score
FROM `blog` b
JOIN `sitemap` sm ON sm.id = b.pageId
WHERE 'Active' IN (b.status, sm.status)
HAVING score > 0
ORDER BY score DESC
The score alias is not available while the WHERE clause is evaluated, because WHERE is applied before the select list is computed, so it can't be used there. It can be used in HAVING instead. ORDER BY is applied last, so score can be used there as well.
Documentation: SELECT (also for HAVING and ORDER)
Here:
SELECT b.shortDesc,
b.img,
sm.uri,
match(`bodyCopy`, `shortDesc`) against ('Storage' IN NATURAL LANGUAGE MODE WITH
QUERY EXPANSION) AS score
FROM `blog` b
JOIN `sitemap` sm ON sm.id = b.pageId
WHERE 'Active' IN (b.status, sm.status)
AND match(`bodyCopy`, `shortDesc`) against ('Storage' IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION) > 0
ORDER BY score DESC
I have the following query, but after some time when users start putting in more and more items in the "ci_falsepositives" table, it gets really slow.
The ci_falsepositives table contains a reference field from ci_address_book and another reference field from ci_matched_sanctions.
How can I rewrite the query so that it stays fast while still letting me sort on each field?
For example, I still need to be able to sort on "hits" or "matches".
SELECT *, matches - falsepositives AS hits
FROM (SELECT c.*, IFNULL(p.total, 0) AS matches,
(SELECT COUNT(*)
FROM ci_falsepositives n
WHERE n.addressbook_id = c.reference
AND n.sanction_key IN
(SELECT sanction_key FROM ci_matched_sanctions)
) AS falsepositives
FROM ci_address_book c
LEFT JOIN
(SELECT addressbook_id, COUNT(match_id) AS total
FROM ci_matched_sanctions
GROUP BY addressbook_id) AS p
ON c.id = p.addressbook_id
) S
ORDER BY folder asc, wholename ASC
LIMIT 0,15
The problem has to be the SELECT COUNT(*) FROM ci_falsepositives sub-query. That sub-query can be written using an inner join between ci_falsepositives and ci_matched_sanctions, but the optimizer might do that for you anyway. What I think you need to do, though, is make that sub-query into a separate query in the FROM clause of the 'next query out' (that is, SELECT c.*, ...). Probably, that query is being evaluated multiple times - and that's what's hurting you when people add records to ci_falsepositives. You should study the query plan carefully.
Maybe this query will be better:
SELECT *, matches - falsepositives AS hits
FROM (SELECT c.*, IFNULL(p.total, 0) AS matches, IFNULL(f.falsepositives, 0) AS falsepositives
FROM ci_address_book AS c
LEFT JOIN (SELECT n.addressbook_id, COUNT(*) AS falsepositives
FROM ci_falsepositives AS n
JOIN ci_matched_sanctions AS m
ON n.sanction_key = m.sanction_key
GROUP BY n.addressbook_id
) AS f
ON c.reference = f.addressbook_id
LEFT JOIN
(SELECT addressbook_id, COUNT(match_id) AS total
FROM ci_matched_sanctions
GROUP BY addressbook_id) AS p
ON c.id = p.addressbook_id
) AS s
ORDER BY folder asc, wholename ASC
LIMIT 0, 15
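To check that a pre-aggregated rewrite returns the same numbers as the original correlated subquery, a small test harness can help. The sketch below uses an in-memory SQLite database from Python with invented rows and guessed column types. It makes two deliberate choices that are assumptions rather than part of the answer above: the false-positive aggregate is LEFT JOINed so address-book rows with no false positives keep a count of 0, and the IN filter is kept inside the derived table so duplicate sanction_key values cannot inflate the count.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical data: Alice has two matches and one false positive,
# Bob has one match and no false positives.
con.executescript("""
    CREATE TABLE ci_address_book (id INTEGER, reference TEXT, wholename TEXT);
    CREATE TABLE ci_matched_sanctions (match_id INTEGER, addressbook_id INTEGER, sanction_key TEXT);
    CREATE TABLE ci_falsepositives (addressbook_id TEXT, sanction_key TEXT);
    INSERT INTO ci_address_book VALUES (1, 'A1', 'Alice'), (2, 'B2', 'Bob');
    INSERT INTO ci_matched_sanctions VALUES (100, 1, 'k1'), (101, 1, 'k2'), (102, 2, 'k3');
    INSERT INTO ci_falsepositives VALUES ('A1', 'k1');
""")

rows = con.execute("""
    SELECT c.wholename,
           IFNULL(p.total, 0) AS matches,
           IFNULL(f.falsepositives, 0) AS falsepositives,
           IFNULL(p.total, 0) - IFNULL(f.falsepositives, 0) AS hits
    FROM ci_address_book AS c
    -- False positives are counted once per address-book reference.
    LEFT JOIN (SELECT n.addressbook_id, COUNT(*) AS falsepositives
               FROM ci_falsepositives AS n
               WHERE n.sanction_key IN (SELECT sanction_key FROM ci_matched_sanctions)
               GROUP BY n.addressbook_id) AS f
           ON c.reference = f.addressbook_id
    -- Matches are counted once per address-book id.
    LEFT JOIN (SELECT addressbook_id, COUNT(match_id) AS total
               FROM ci_matched_sanctions
               GROUP BY addressbook_id) AS p
           ON c.id = p.addressbook_id
    ORDER BY c.wholename
""").fetchall()

print(rows)
```

Each derived table is computed once and then joined, instead of running a COUNT(*) subquery for every address-book row; whether that is actually faster on the real data still has to be confirmed from the query plan.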