Count occurrences of a sub string in a MySQL column


I have a table that stores information about a large number of Twitter tweets, including the tweet text and the screen name of the user who posted each one. The tweets contain hashtags (words starting with #), and I want to count the number of hashtags that a specific user has tweeted:
tweet_id | tweet_text | screen_name |
--------------------------------------------------------------------------------------------
1 | #hashtag1 #otherhashtag2 #hashtag3 some more text | tweeter_user_1 |
2 | some text #hashtag1 #hashtag4 more text | tweeter_user_2 |
3 | #hashtag5 #hashtag1 #not a hashtag some#nothashtag | tweeter_user_1 |
4 | #hashtag1 with more text | tweeter_user_3 |
5 | #otherhashtag2 #hashtag3,#hashtag4 more text | tweeter_user_1 |
If I count the hashtags of tweeter_user_1, the result I expect is 8; for tweeter_user_3 it should be 1. How can I do this, assuming my table is named tweets?
I tried this: SELECT COUNT( * ) FROM tweets WHERE( LENGTH( REPLACE( tweet_text, '#%', '#') = 0 ) ) AND screen_name = 'tweeter_user_1' but it didn't work.
I would be happy if the result for tweeter_user_1 was 9 too :D

This should give you a list of screen names and the total number of # characters each user has tweeted. Note that it counts every # character, including ones that do not start a hashtag (such as the one in some#nothashtag).
SELECT foo.screen_name, SUM(foo.counts) FROM
(
    SELECT screen_name,
    LENGTH(tweet_text) - LENGTH(REPLACE(tweet_text, '#', '')) AS counts
    FROM tweets
) AS foo
GROUP BY foo.screen_name
However, this is an expensive query if the table is huge. If you only need the count for a single user, filter on that user in the inner select, like this:
SELECT foo.screen_name, SUM(foo.counts) FROM
(
    SELECT screen_name,
    LENGTH(tweet_text) - LENGTH(REPLACE(tweet_text, '#', '')) AS counts
    FROM tweets WHERE screen_name = 'tweeter_user_1'
) AS foo
GROUP BY foo.screen_name
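To see what the length-difference trick returns on the sample rows from the question, here is a minimal sketch that runs the same expression in an in-memory SQLite database. SQLite is only a stand-in for MySQL here (an assumption made so the example is runnable); its LENGTH and REPLACE behave the same way for this purpose.

```python
import sqlite3

# Sample rows copied from the question.
tweets = [
    (1, "#hashtag1 #otherhashtag2 #hashtag3 some more text", "tweeter_user_1"),
    (2, "some text #hashtag1 #hashtag4 more text", "tweeter_user_2"),
    (3, "#hashtag5 #hashtag1 #not a hashtag some#nothashtag", "tweeter_user_1"),
    (4, "#hashtag1 with more text", "tweeter_user_3"),
    (5, "#otherhashtag2 #hashtag3,#hashtag4 more text", "tweeter_user_1"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tweets (tweet_id INTEGER, tweet_text TEXT, screen_name TEXT)"
)
conn.executemany("INSERT INTO tweets VALUES (?, ?, ?)", tweets)

# Removing every '#' shortens the text by exactly one character per '#',
# so the length difference is the number of '#' characters in the tweet.
counts = dict(conn.execute("""
    SELECT screen_name,
           SUM(LENGTH(tweet_text) - LENGTH(REPLACE(tweet_text, '#', '')))
    FROM tweets
    GROUP BY screen_name
"""))

print(counts)
# {'tweeter_user_1': 10, 'tweeter_user_2': 2, 'tweeter_user_3': 1}
```

For tweeter_user_1 this gives 10 rather than the 8 or 9 the question asks for, because the # in "#not" and the one in "some#nothashtag" are counted too.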

Depending on how often you need to run the query, you could be causing MySQL to spend a lot of CPU time parsing and reparsing the tweet_text column. I would strongly recommend adding a hashtag_qty column (or similar) and storing the hashtag count there when you first populate the row.
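A rough sketch of that idea follows: the count is computed once, when the row is written, and later queries just sum the stored column. The regular expression, the helper names count_hashtags and insert_tweet, and the use of SQLite instead of MySQL are all assumptions for illustration. The pattern treats a # as a hashtag only when it is not glued to a preceding word character.

```python
import re
import sqlite3

# Assumed definition of a hashtag: '#' followed by word characters,
# and not immediately preceded by a word character.
HASHTAG = re.compile(r"(?<!\w)#\w+")


def count_hashtags(text):
    """Return the number of hashtags in a tweet under the assumed pattern."""
    return len(HASHTAG.findall(text))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tweets (tweet_id INTEGER PRIMARY KEY, tweet_text TEXT, "
    "screen_name TEXT, hashtag_qty INTEGER)"
)


def insert_tweet(conn, tweet_id, text, screen_name):
    """Store the tweet together with its precomputed hashtag count."""
    conn.execute(
        "INSERT INTO tweets VALUES (?, ?, ?, ?)",
        (tweet_id, text, screen_name, count_hashtags(text)),
    )


insert_tweet(conn, 1, "#hashtag1 #otherhashtag2 #hashtag3 some more text", "tweeter_user_1")
insert_tweet(conn, 3, "#hashtag5 #hashtag1 #not a hashtag some#nothashtag", "tweeter_user_1")
insert_tweet(conn, 5, "#otherhashtag2 #hashtag3,#hashtag4 more text", "tweeter_user_1")

# Reading the total no longer needs to scan the tweet text.
total = conn.execute(
    "SELECT SUM(hashtag_qty) FROM tweets WHERE screen_name = ?",
    ("tweeter_user_1",),
).fetchone()[0]

print(total)  # 9
```

With this pattern "#not" still counts and "some#nothashtag" does not, which gives the 9 that the question says would also be acceptable.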

Related

ASCII sum of all the characters in a MySQL column

I have a table users (only 2 of its columns are shown here), and I want to sum the ASCII values of all the characters in the name column.
+----+-------+
| id | name |
+----+-------+
| 0 | user |
| 1 | admin |
| 3 | edit |
+----+-------+
For example, the ASCII sum of user would be
sum(user)=117+115+101+114=447
I have tried this:
SELECT ASCII(Substr(name, 1,1)) + ASCII(Substr(name, 2, 1)) FROM user
but it only sums the first 2 characters.

You are going to have to fetch one character at a time to do the sum. One method is to write a function with a while loop. You can also do this with a SELECT, if you know the length of the longest string:
SELECT name, SUM(ASCII(SUBSTR(name, n.n, 1)))
FROM user u JOIN
     (SELECT 1 as n UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL
      SELECT 4 UNION ALL SELECT 5 -- sufficient for your examples
     ) n
     ON n.n <= LENGTH(name) -- one joined row per character position in the name
GROUP BY name;
If your goal is to turn the string into something that can be easily compared or has a fixed length, then you might consider the hash functions in MySQL, such as MD5() or SHA2(). Adding up the ASCII values is not a particularly good hash function, because strings with the same characters in different orders produce the same value. At the very least, multiplying each ASCII value by its position is a bit better.
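The following sketch runs the numbers-table join in an in-memory SQLite database, pairing each name with the positions 1 up to its length. It computes both the plain sum and the position-weighted sum. SQLite's unicode() stands in for MySQL's ASCII(), the table is called users, and the extra row 'tide' is a hypothetical anagram of 'edit' added only to show the collision.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(0, "user"), (1, "admin"), (3, "edit"), (4, "tide")],  # 'tide' is hypothetical
)

# Each name is joined to the positions 1..length(name); every joined row
# contributes the code of the character at that position.
sql = """
    SELECT u.name,
           SUM(unicode(substr(u.name, nums.n, 1)))          AS plain_sum,
           SUM(nums.n * unicode(substr(u.name, nums.n, 1))) AS weighted_sum
    FROM users u
    JOIN (SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL
          SELECT 4 UNION ALL SELECT 5) nums
      ON nums.n <= length(u.name)
    GROUP BY u.name
"""
sums = {name: (plain, weighted) for name, plain, weighted in conn.execute(sql)}

print(sums["user"])  # (447, 1106)
print(sums["edit"])  # (422, 1080)
print(sums["tide"])  # (422, 1030)
```

The anagrams 'edit' and 'tide' share the plain sum 422 but get different position-weighted sums, which is the improvement the answer describes.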

Distinct number of specific items in list

I rarely do stuff in MySQL, so for me this is rocket science ...
I want to know how many times distinct values starting with "abc-" are present in a list.
So for example how many times "abc-table" and "abc-sofa" are present.
The table:
| object
-----------
| abc-table
| def-table
| ghi-chair
| abc-sofa
| abc-table
The result should be like:
| name number
-------------------
| abc-table 2
| abc-sofa 1
(Excuse me for the badly formatted tables.)
I tried the following, but that turns out to be incorrect:
SELECT object, COUNT(DISTINCT object) WHERE object LIKE abc-% FROM table GROUP BY object
Any help is appreciated.

The WHERE clause must come after FROM.
Put the LIKE pattern in single quotes (').
There is no need for DISTINCT in your case.
If your table really is named table, quote it with backticks, because TABLE is a reserved word.
Try the query below:
SELECT `object` AS `name`, COUNT(`object`) AS `number`
FROM `table`
WHERE `object` LIKE 'abc-%'
GROUP BY `object`
ORDER BY COUNT(`object`) DESC; -- add ORDER BY if you need to sort by count
Result:
name number
----------------
abc-table 2
abc-sofa 1

Use COUNT(*), GROUP BY, LIKE 'abc-%' and, optionally, HAVING:
SELECT object, COUNT(*)
FROM `table`
WHERE object LIKE 'abc-%'
GROUP BY object
HAVING COUNT(*) >= 1
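As a quick check of the filter-then-group approach, this sketch runs it on the sample rows in an in-memory SQLite database. SQLite stands in for MySQL, and the table name objects is a hypothetical choice that avoids the reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (object TEXT)")  # hypothetical table name
conn.executemany(
    "INSERT INTO objects VALUES (?)",
    [("abc-table",), ("def-table",), ("ghi-chair",), ("abc-sofa",), ("abc-table",)],
)

# WHERE filters the rows first; GROUP BY then yields one row per distinct
# value, and COUNT(*) counts how many rows fell into each group.
result = conn.execute("""
    SELECT object AS name, COUNT(*) AS number
    FROM objects
    WHERE object LIKE 'abc-%'
    GROUP BY object
    ORDER BY COUNT(*) DESC
""").fetchall()

print(result)  # [('abc-table', 2), ('abc-sofa', 1)]
```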

MySQL search query ordered by match relevance

I know basic MySQL querying, but I have no idea how to achieve an accurate and relevant search query.
My table looks like this:
id | kanji
-------------
1 | 一子
2 | 一人子
3 | 一私人
4 | 一時
5 | 一時逃れ
I already have this query:
SELECT * FROM `definition` WHERE `kanji` LIKE '%一%'
The problem is that I want to order the results by the characters the user has already learnt, with 一 being a required character for the results of this query.
Say, a user knows those characters: 人,子,時
Then, I want the results to be ordered that way:
id | kanji
-------------
2 | 一人子
1 | 一子
4 | 一時
3 | 一私人
5 | 一時逃れ
The result which matches the most learnt characters should be first. If possible, I'd like to show results that contain only learnt characters first, then a mix of learnt and unknown characters.
How do I do that?

Per your preference, this orders by the number of unmatched characters (ascending), and then by the number of matched characters (descending).
SELECT *,
    (kanji LIKE '%人%')
  + (kanji LIKE '%子%')
  + (kanji LIKE '%時%') score
FROM `definition`
WHERE `kanji` LIKE '%一%'
ORDER BY CHAR_LENGTH(kanji) - score, score DESC
Or, the relational way to do it is to normalize. Create the table like this:
kanji_characters
kanji_id | index | character
----------------------------
1 | 0 | 一
1 | 1 | 子
2 | 0 | 一
2 | 1 | 人
2 | 2 | 子
...
Then (index and character are reserved words in MySQL, so they are quoted with backticks):
SELECT kanji_id,
    COUNT(*) length,
    SUM(CASE WHEN `character` IN ('人','子','時') THEN 1 ELSE 0 END) score
FROM kanji_characters
WHERE `index` <> 0
    AND kanji_id IN (SELECT kanji_id FROM kanji_characters WHERE `index` = 0 AND `character` = '一')
GROUP BY kanji_id
ORDER BY length - score, score DESC
Though you didn't specify what should be done in the case of duplicate characters. The two solutions above handle that differently.
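To make the ordering rule concrete, here is a small sketch of the same two-part sort key applied to the sample words from the question. It mirrors the LIKE-based scoring, where each learnt character counts at most once per word; the variable names are illustrative only.

```python
# Sample data from the question: id -> word.
words = {1: "一子", 2: "一人子", 3: "一私人", 4: "一時", 5: "一時逃れ"}
learnt = ["人", "子", "時"]
required = "一"


def sort_key(item):
    """Sort by unmatched characters ascending, then matched characters descending."""
    _, word = item
    # Like (kanji LIKE '%x%'): 1 per learnt character that appears at all.
    score = sum(ch in word for ch in learnt)
    return (len(word) - score, -score)


matches = [(i, w) for i, w in words.items() if required in w]
ordered_ids = [i for i, _ in sorted(matches, key=sort_key)]

print(ordered_ids)  # [2, 1, 4, 3, 5]
```

The required character 一 adds one unmatched character to every word, so it does not change the relative order. A word that repeats a learnt character would score 1 for it here, whereas a per-character count would score each occurrence, which is the duplicate-handling difference mentioned above.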

Just a thought, but a full-text index may help. You can get a relevance score back like this (RANK is a reserved word in MySQL 8.0, so the alias is named score, and the best matches are sorted first):
SELECT MATCH(kanji) AGAINST ('your search' IN NATURAL LANGUAGE MODE) AS score
FROM `definition` WHERE MATCH(`kanji`) AGAINST ('your search' IN NATURAL LANGUAGE MODE)
ORDER BY score DESC, LENGTH(kanji)
The trick is to index these terms (or words?) the right way. I think the general approach is to wrap each word in double quotes and put a space between them, so that the tokenizer populates the index the way you want. Of course, you would need to add the quotes on the way in and remove them on the way out.
Hope this doesn't bog you down.

mysql SORT BY amount of unique word matches

I've found many questions that ask about the number of occurrences, but none that ask for exactly what I want to do.
A dynamically generated (prepared-statement) query will result in something like this:
SELECT * FROM products WHERE
( title LIKE ? AND title LIKE ? ) AND
( content LIKE ? OR content LIKE ? ) AND
( subtitle LIKE ? AND author LIKE ? )
ORDER BY relevance LIMIT ?,?
The number of words entered (and so the number of LIKE conditions) for title, content and author varies with the search query.
Now I've added an ORDER BY relevance, but I want this order to be the number of unique words that match in the content field. (Note: not the number of occurrences, but the number of entered strings that have at least one match in the content column.)
Example table products:
id | title | subtitle | content
------------------------------------
1 | animals | cat | swim swim swim swim swim swim swim
2 | canimal | fish | some content
3 | food | roasted | some content
4 | animal | cat | swim better better swims better something else
5 | animal | cat | dogs swim better
Example query (with the prepared-statement ? placeholders filled in):
SELECT * FROM products WHERE
( title LIKE '%animal%' ) AND
( content LIKE '%dog%' OR content LIKE '%swim%' OR content LIKE '%better%' ) AND
( subtitle LIKE '%cat%' )
ORDER BY relevance LIMIT 0,10
Expected results (in correct order!):
id | amount of matches
-----------------
5 | 3 (dog, swim, better)
4 | 2 (swim, better)
1 | 1 (swim)
I have an Innodb table and mysql version lower than 5.6, therefore I can't use MATCH...AGAINST.
I was thinking this could be solved with CASE WHEN ... THEN, but I have no idea how to build this sort.

You can do it in many ways, for example:
ORDER BY SIGN(LOCATE('dog',content))+
SIGN(LOCATE('swim',content))+
SIGN(LOCATE('better',content)) DESC
or with CASE
ORDER BY
CASE WHEN content LIKE '%dog%'
THEN 1
ELSE 0
END
+
CASE WHEN content LIKE '%swim%'
THEN 1
ELSE 0
END
+
CASE WHEN content LIKE '%better%'
THEN 1
ELSE 0
END
DESC
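Since the query in the question is generated dynamically, the sketch below builds the CASE-based ORDER BY with one placeholder per search term and runs it against the sample rows. It uses an in-memory SQLite database as a stand-in for MySQL (an assumption made for runnability); the variable names are illustrative.

```python
import sqlite3

rows = [
    (1, "animals", "cat", "swim swim swim swim swim swim swim"),
    (2, "canimal", "fish", "some content"),
    (3, "food", "roasted", "some content"),
    (4, "animal", "cat", "swim better better swims better something else"),
    (5, "animal", "cat", "dogs swim better"),
]
terms = ["dog", "swim", "better"]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER, title TEXT, subtitle TEXT, content TEXT)"
)
conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?)", rows)

patterns = [f"%{t}%" for t in terms]
# One OR-ed LIKE per term for filtering, one CASE per term for scoring.
any_match = " OR ".join("content LIKE ?" for _ in terms)
relevance = " + ".join(
    "CASE WHEN content LIKE ? THEN 1 ELSE 0 END" for _ in terms
)

sql = f"""
    SELECT id, {relevance} AS relevance
    FROM products
    WHERE title LIKE ? AND ({any_match}) AND subtitle LIKE ?
    ORDER BY relevance DESC
    LIMIT ? OFFSET ?
"""
# Parameters follow the order in which the placeholders appear in the SQL text.
params = patterns + ["%animal%"] + patterns + ["%cat%", 10, 0]
ranked = conn.execute(sql, params).fetchall()

print(ranked)  # [(5, 3), (4, 2), (1, 1)]
```

Each term contributes at most 1 to the score no matter how often it occurs, so row 1 with seven occurrences of "swim" still scores 1.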

Check like this.
SELECT id, CONCAT_WS('-', COUNT(LENGTH(content) - LENGTH(REPLACE(content, ' ', '')) + 1), REPLACE(content, ' ', ',')) AS `amount of matches` FROM products
WHERE
( title LIKE '%animal%' ) AND
( content LIKE '%dog%' OR content LIKE '%swim%' OR content LIKE '%better%' ) AND
( subtitle LIKE '%cat%' )
GROUP BY id
ORDER BY id

Select random data with DataMapper

I'm trying to select random records with DataMapper, but it seems there is no built-in support for this.
For example, I have this set of data:
+-------------------+
| ID | Name | Value |
+-------------------+
| 1 | T1 | 123 |
| 2 | T2 | 456 |
| 3 | T3 | 789 |
| 4 | T4 | 101 |
| ----------------- |
| N | Tn | value |
There can be a lot of data, more than 100k rows.
I need to map the data to this object:
class Item
include DataMapper::Resource
property :id, Serial
property :name, String
property :value, String
end
So, the question is: how do I select random rows from the table?
The equivalent query in SQL would be:
SELECT id, name, value FROM table ORDER BY RAND() LIMIT n;

A long time after the OP, but since this is the first google hit for "datamapper random row"...
Using pure DataMapper, and without making assumptions about continuous IDs, etc, you can do:
Item.first(:offset => rand(Item.count))
which results in the queries:
SELECT COUNT(*) FROM `items`
SELECT <fields> FROM `items` ORDER BY `id` LIMIT 1 OFFSET <n>
If you'd prefer a single query, at the cost of loading every row (which can be much slower on a large table), you can do:
Item.all.sample
which results in:
SELECT <fields> FROM `items` ORDER BY `id`
Obviously, wrap this in a transaction if you need to.
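The count-then-offset idea is not specific to DataMapper. As a language-neutral illustration, here is a minimal sketch of the same two queries using Python's sqlite3 module; the items table and the helper name random_item are assumptions for the example.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, value TEXT)")
sample_rows = [(1, "T1", "123"), (2, "T2", "456"), (3, "T3", "789"), (4, "T4", "101")]
conn.executemany("INSERT INTO items (id, name, value) VALUES (?, ?, ?)", sample_rows)


def random_item(conn):
    """Pick one row uniformly at random using COUNT(*) and a random OFFSET."""
    (count,) = conn.execute("SELECT COUNT(*) FROM items").fetchone()
    if count == 0:
        return None
    offset = random.randrange(count)  # 0 <= offset < count
    return conn.execute(
        "SELECT id, name, value FROM items ORDER BY id LIMIT 1 OFFSET ?",
        (offset,),
    ).fetchone()


item = random_item(conn)
print(item)
```

Because it relies on row position rather than id values, this works even when the ids have gaps. If rows can be deleted between the two queries, the offset may point past the end, which is why the transaction mentioned above can matter.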

I generally don't need truly random records, so I use a slightly different approach:
SELECT id, name, value FROM items
ORDER BY value -- or value MOD some number; you could also use name, or some function of the name
LIMIT n OFFSET k
where k is a random number generated in your code, no greater than N-n (N being the number of rows). This is sufficiently random for most cases, even though the records are contiguous in whatever you use for ORDER BY.
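A sketch of this windowed variant, again with Python's sqlite3 module as a stand-in; the items table and the helper name random_window are hypothetical.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO items (id, name, value) VALUES (?, ?, ?)",
    [(1, "T1", "123"), (2, "T2", "456"), (3, "T3", "789"), (4, "T4", "101")],
)


def random_window(conn, n):
    """Return n rows that are contiguous in `value` order, starting at a random offset."""
    (total,) = conn.execute("SELECT COUNT(*) FROM items").fetchone()
    n = min(n, total)
    k = random.randint(0, total - n)  # inclusive bounds, so the window always fits
    return conn.execute(
        "SELECT id, name, value FROM items ORDER BY value LIMIT ? OFFSET ?",
        (n, k),
    ).fetchall()


window = random_window(conn, 2)
print(window)
```

Ordering by a column unrelated to insertion order, as here, makes the contiguous window look less predictable than ordering by id would.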

You could generate a random number x < number_of_rows and fetch the row with that id (this assumes the ids are contiguous).
You could also try entering the SQL directly, like this:
find_by_sql(<<-SQL, :properties => property_set)
  SELECT `id`, `name`, `value` FROM table ORDER BY RAND() LIMIT n;
SQL
You need to specify :properties, though, for the result to map to your property set.
