Select data which have same letters - mysql

I'm having trouble with this SQL:
$sql = mysql_query("SELECT $menucompare ,
(COUNT($menucompare ) * 100 / (SELECT COUNT( $menucompare )
FROM data WHERE $ww = $button )) AS percentday FROM data WHERE $ww >0 ");
$menucompare is table fields names what ever field is selected and contains data bellow
$button is the week number selected (lets say week '6')
$ww table field name with row who have the number of week '6'
For example, I have data in $menucompare like that:
123456bool
521478bool
122555heel
147788itoo
and I want to select those, who have same word in the last of the data and make percentage.
The output should be like that:
bool -- 50% (2 entries)
heel -- 25% (1 entry)
itoo -- 25% (1 entry)
Any clearness to my SQL will be very appreciated.
I didn't find anything like that around.

Well, keeping data in such format probably not the best way, if possible, split the field into 2 separate ones.
First, you need to extract the string part from the end of the field.
if the length of the string / numeric parts is fixed, then it's quite easy;
if not, you should use regular expressions which, unfortunately, are not there by default with MySQL. There's a solution, check this question: How to do a regular expression replace in MySQL?
I'll assume, that numeric part is fixed:
SELECT s.str, CAST(count(s.str) AS decimal) / t.cnt * 100 AS pct
FROM (SELECT substr(entry, 7) AS str FROM data) AS s
JOIN (SELECT count(*) AS cnt FROM data) AS t ON 1=1
GROUP BY s.str, t.cnt;
If you'll have regexp_replace function, then substr(entry, 7) should be replaced to regexp_replace(entry, '^[0-9]*', '') to achieve the required result.
Variant with substr can be tested here.

When sorting out problems like this, I would do it in two steps:
Sort out the SQL independently of the presentation language (PHP?).
Sort out the parameterization of the query and the presentation of the results after you know you've got the correct query.
Since this question is tagged 'SQL', I'm only going to address the first question.
The first step is to unclutter the query:
SELECT menucompare,
(COUNT(menucompare) * 100 / (SELECT COUNT(menucompare) FROM data WHERE ww = 6))
AS percentday
FROM data
WHERE ww > 0;
This removes the $ signs from most of the variable bits, and substitutes 6 for the button value. That makes it a bit easier to understand.
Your desired output seems to need the last four characters of the string held in menucompare for grouping and counting purposes.
The data to be aggregated would be selected by:
SELECT SUBSTR(MenuCompare, -4) AS Last4
FROM Data
WHERE ww = 6
The divisor in the percentage is the count of such rows, but the sub-stringing isn't necessary to count them, so we can write:
SELECT COUNT(*) FROM Data WHERE ww = 6
This is exactly what you have anyway.
The divdend in the percentage will be the group count of each substring.
SELECT Last4, COUNT(Last4) * 100.0 / (SELECT COUNT(*) FROM Data WHERE ww = 6)
FROM (SELECT SUBSTR(MenuCompare, -4) AS Last4
FROM Data
WHERE ww = 6
) AS Week6
GROUP BY Last4
ORDER BY Last4;
When you've demonstrated that this works, you can re-parameterize the query and deal with the presentation of the results.

Related

How to Find First Valid Row in SQL Based on Difference of Column Values

I am trying to find a reliable query which returns the first instance of an acceptable insert range.
Research:
some of the below links adress similar questions, but I could get none of them to work for me.
Find first available date, given a date range in SQL
Find closest date in SQL Server
MySQL difference between two rows of a SELECT Statement
How to find a gap in range in SQL
and more...
Objective Query Function:
InsertRange(1) = (StartRange(i) - EndRange(i-1)) > NewValue
Where InsertRange(1) is the value the query should return. In other words, this would be the first instance where the above condition is satisfied.
Table Structure:
Primary Key: StartRange
StartRange(i-1) < StartRange(i)
StartRange(i-1) + EndRange(i-1) < StartRange(i)
Example Dataset
Below is an example User table (3 columns), with a set range distribution. StartRanges are always ordered in a strictly ascending way, UserID are arbitrary strings, only the sequences of StartRange and EndRange matters:
StartRange EndRange UserID
312 6896 user0
7134 16268 user1
16877 22451 user2
23137 25142 user3
25955 28272 user4
28313 35172 user5
35593 38007 user6
38319 38495 user7
38565 45200 user8
46136 48007 user9
My current Query
I am trying to use this query at the moment:
SELECT t2.StartRange, t2.EndRange
FROM user AS t1, user AS t2
WHERE (t1.StartRange - t2.StartRange+1) > NewValue
ORDER BY t1.EndRange
LIMIT 1
Example Case
Given the table, if NewValue = 800, then the returned answer should be 23137. This means, the first available slot would be between user3 and user4 (with an actual slot size = 813):
InsertRange(1) = (StartRange(i) - EndRange(i-1)) > NewValue
InsertRange = (StartRange(6) - EndRange(5)) > NewValue
23137 = 25955 - 25142 > 800
More Comments
My query above seemed to be working for the special case where StartRanges where tightly packed (i.e. StartRange(i) = StartRange(i-1) + EndRange(i-1) + 1). This no longer works with a less tightly packed set of StartRanges
Keep in mind that SQL tables have no implicit row order. It seems fair to order your table by StartRange value, though.
We can start to solve this by writing a query to obtain each row paired with the row preceding it. In MySQL, it's hard to do this beautifully because it lacks the row numbering function.
This works (http://sqlfiddle.com/#!9/4437c0/7/0). It may have nasty performance because it generates O(n^2) intermediate rows. There's no row for user0; it can't be paired with any preceding row because there is none.
select MAX(a.StartRange) SA, MAX(a.EndRange) EA,
b.StartRange SB, b.EndRange EB , b.UserID
from user a
join user b ON a.EndRange <= b.StartRange
group by b.StartRange, b.EndRange, b.UserID
Then, you can use that as a subquery, and apply your conditions, which are
gap >= 800
first matching row (lowest StartRange value) ORDER BY SB
just one LIMIT 1
Here's the query (http://sqlfiddle.com/#!9/4437c0/11/0)
SELECT SB-EA Gap,
EA+1 Beginning_of_gap, SB-1 Ending_of_gap,
UserId UserID_after_gap
FROM (
select MAX(a.StartRange) SA, MAX(a.EndRange) EA,
b.StartRange SB, b.EndRange EB , b.UserID
from user a
join user b ON a.EndRange <= b.StartRange
group by b.StartRange, b.EndRange, b.UserID
) pairs
WHERE SB-EA >= 800
ORDER BY SB
LIMIT 1
Notice that you may actually want the smallest matching gap instead of the first matching gap. That's called best fit, rather than first fit. To get that you use ORDER BY SB-EA instead.
Edit: There is another way to use MySQL to join adjacent rows, that doesn't have the O(n^2) performance issue. It involves employing user variables to simulate a row_number() function. The query involved is a hairball (that's a technical term). It's described in the third alternative of the answer to this question. How do I pair rows together in MYSQL?

MySQL match area code only when given the full number

I have a database that lists a few area codes, area code + office codes and some whole numbers and a action. I want it to return a result by the digits given but I am not sure how to accomplish it. I have some MySQL knowledge but its not very deep.
Here is a example:
match | action
_____________________
234 | goto 1
333743 | goto 2
8005551212| goto 3
234843 | goto 4
I need to query the database with a full 10 digit number -
query 8005551212 gives "goto 3"
query 2345551212 gives "goto 1"
query 3337431212 gives "goto 2"
query 2348431212 gives "goto 4"
This would be similar to the LIKE selection, but I need to match against the database value instead of the query value. Matching the full number is easy,
SELECT * FROM database WHERE `match` = 8005551212;
First the number to query will always be 10 digits, so I am not sure how to format the SELECT statement to differentiate the match of 234XXXXXXX and 234843XXXX, as I can only have one match return. Basically if it does not match the 10 digits, then it checks 6 digits, then it will check the 3 digits.
I hope this makes sense, I do not have any other way to format the number and it has to be accomplished with just a single SQL query and return over a ODCB connection in Asterisk.
Try this
SELECT match, action FROM mytable WHERE '8005551212' like concat(match,'%')
The issue is that you will get two rows in one case .. given your data..
SELECT action
FROM mytable
WHERE '8005551212' like concat(match,'%')
order by length(match) desc limit 1
That should get the row that had the most digits matched..
try this:
SELECT * FROM (
SELECT 3 AS score,r.* FROM mytable r WHERE match LIKE CONCAT(SUBSTRING('1234567890',1,3),'%')
UNION ALL
SELECT 6 AS score,r.* FROM mytable r WHERE match LIKE CONCAT(SUBSTRING('1234567890',1,6),'%')
UNION ALL
SELECT 10 AS score,r.* FROM mytable r WHERE match LIKE CONCAT(SUBSTRING('1234567890',1,10),'%')
) AS tmp
ORDER BY score DESC
LIMIT 1;
What ended up working -
SELECT `function`,`destination`
FROM reroute
WHERE `group` = '${ARG2}'
AND `name` = 0
AND '${ARG1}' LIKE concat(`match`,'%')
ORDER BY length(`match`) DESC LIMIT 1

Break Numbers List Into Min and Max Ranges

Brain is not working today and my google skills are failing me.
I have a column of numbers ranging from 1 - 1000. I want to dump the min and max values for 100 (or whatever I chose) record ranges into a temp table. The plan is to use this temp table to process ranges of records (in this example 100 at a time) in a larger table.
Swear I have done this before with a CTE but then I had something to group on. Here I just want to break up a single list of numbers into ranges of X.
The output from the temp table should look like:
Min Max
0 99
100 199
200 299
300 399
etc.
Thanks!
You can use this trick from Stuart Ainsworth:
http://codegumbo.com/index.php/2009/01/25/building-ranges-using-a-dynamically-generated-numbers-table/
Numbers tables are awesome, but he uses a dynamically generated numbers table, which is even awesome...r.
If you know all numbers are present in the source table, you can use a recursive CTE to generate the number ranges:
; with numbers as
(
select 0 as a
, 99 as b
union all
select a+100
, b+100
from numbers
where a < 900
)
select *
from numbers
If the source table is sparsely populated, you can limit it to numbers that are actually present like:
... insert CTE from above here ...
select min(ot.NumberColumn)
, max(ot.NumberColumn)
from numbers
left join
OtherTable ot
on ot.NumberColumn between numbers.a and numbers.b
group by
numbers.a
enter code hereI have been having a play with a CTE after you posted this and came up with the following, I would be interested to hear if it works for you at all.
DECLARE #segment int = 100
;
WITH _CTE
(rowNum, value)
AS
(
SELECT ROW_NUMBER() OVER(ORDER BY col01) -1, col01
FROM dbo.testTable
)
SELECT rowNum/#segment AS Bucket, MIN(Value) AS MinVal, MAX(Value) AS MaxVal
FROM _CTE
group by rowNum/#segment
ORDER BY Bucket
;
col01 in this case is the column that you want the min/max range values from, as is TestTable.

Combine 2 SELECTs into one SELECT in my RAILS application

I have a table called ORDEREXECUTIONS that stores all orders that have been executed. It's a multi currency application hence the table has two columns CURRENCY1_ID and CURRENCY2_ID.
To get a list of all orders for a specific currency pair (e.g. EUR/USD) I need to lines to get the totals:
v = Orderexecution.where("is_master=1 and currency1_id=? and currency2_id=? and created_at>=?",c1,c2,Time.now()-24.hours).sum("quantity").to_d
v+= Orderexecution.where("is_master=1 and currency1_id=? and currency2_id=? and created_at>=?",c2,c1,Time.now()-24.hours).sum("unitprice*quantity").to_d
Note that my SUM() formula is different depending on the the sequence of the currencies.
e.g. If I want the total ordered quantities of the currency pair USD it then executes (assuming currency ID for USD is 1 and EUR is 2.
v = Orderexecution.where("is_master=1 and currency1_id=? and currency2_id=? and created_at>=?",1,2,Time.now()-24.hours).sum("quantity").to_d
v+= Orderexecution.where("is_master=1 and currency1_id=? and currency2_id=? and created_at>=?",2,1,Time.now()-24.hours).sum("unitprice*quantity").to_d
How do I write this in RoR so that it triggers only one single SQL statement to MySQL?
I guess this would do:
v = Orderexecution.where("is_master=1
and ( (currency1_id, currency2_id) = (?,?)
or (currency1_id, currency2_id) = (?,?)
)
and created_at>=?"
,c1, c2, c2, c1, Time.now()-24.hours
)
.sum("CASE WHEN currency1_id=?
THEN quantity
ELSE unitprice*quantity
END"
,c1
)
.to_d
So you could do
SELECT SUM(IF(currency1_id = 1 and currency2_id = 2, quantity,0)) as quantity,
SUM(IF(currency2_id = 1 and currency1_id = 2, unitprice * quantity,0)) as unitprice _quantity from order_expressions
WHERE created_at > ? and (currency1_id = 1 or currency1_id = 2)
If you plug that into find_by_sql you should get one object back, with 2 attributes, quantity and unitprice_quantity (they won't show up in the output of inspect in the console but they should be there if you inspect the attributes hash or call the accessor methods directly)
But depending on your indexes that might actually be slower because it might not be able to use indexes as efficiently. The seemly redundant condition on currency1_id means that this would be able to use an index on [currency1_id, created_at]. Do benchmark before and after - sometimes 2 fast queries are better than one slow one!

MySQL: LIMIT by a percentage of the amount of records?

Let's say I have a list of values, like this:
id value
----------
A 53
B 23
C 12
D 72
E 21
F 16
..
I need the top 10 percent of this list - I tried:
SELECT id, value
FROM list
ORDER BY value DESC
LIMIT COUNT(*) / 10
But this doesn't work. The problem is that I don't know the amount of records before I do the query. Any idea's?
Best answer I found:
SELECT*
FROM (
SELECT list.*, #counter := #counter +1 AS counter
FROM (select #counter:=0) AS initvar, list
ORDER BY value DESC
) AS X
where counter <= (10/100 * #counter);
ORDER BY value DESC
Change the 10 to get a different percentage.
In case you are doing this for an out of order, or random situation - I've started using the following style:
SELECT id, value FROM list HAVING RAND() > 0.9
If you need it to be random but controllable you can use a seed (example with PHP):
SELECT id, value FROM list HAVING RAND($seed) > 0.9
Lastly - if this is a sort of thing that you need full control over you can actually add a column that holds a random value whenever a row is inserted, and then query using that
SELECT id, value FROM list HAVING `rand_column` BETWEEN 0.8 AND 0.9
Since this does not require sorting, or ORDER BY - it is O(n) rather than O(n lg n)
You can also try with that:
SET #amount =(SELECT COUNT(*) FROM page) /10;
PREPARE STMT FROM 'SELECT * FROM page LIMIT ?';
EXECUTE STMT USING #amount;
This is MySQL bug described in here: http://bugs.mysql.com/bug.php?id=19795
Hope it'll help.
I realize this is VERY old, but it still pops up as the top result when you google SQL limit by percent so I'll try to save you some time. This is pretty simple to do these days. The following would give the OP the results they need:
SELECT TOP 10 PERCENT
id,
value
FROM list
ORDER BY value DESC
To get a quick and dirty random 10 percent of your table, the following would suffice:
SELECT TOP 10 PERCENT
id,
value
FROM list
ORDER BY NEWID()
I have an alternative which hasn't been mentionned in the other answers: if you access from any language where you have full access to the MySQL API (i.e. not the MySQL CLI), you can launch the query, ask how many rows there will be and then break the loop if it is time.
E.g. in Python:
...
maxnum = cursor.execute(query)
for num, row in enumerate(query)
if num > .1 * maxnum: # Here I break the loop if I got 10% of the rows.
break
do_stuff...
This works only with mysql_store_result(), not with mysql_use_result(), as the latter requires that you always accept all needed rows.
OTOH, the traffic for my solution might be too high - all rows have to be transferred.