Need to ignore last two values in a lists column using Hive HQL - mysql

I have a column which contains all the values in lists.
Column A|Column B
AAA |1 2 45 67 89
BBB |16 25 36 45 89 63
CCC |52 63 98 41 22 66
Here in the above table, column B contains string values which are actually lists.
I need to ignore the first two and the last two values in Column B.
I tried using split function where i can ignore first two values. But ignoring last two values is the challenge as I have different sized lists.
The code which I used is:
select distinct column_A,column_B,split(column_B,'\\s')[2] AS ign_first_val,
split(column_B,'\\s')[-2] as ign_last_val
FROM Xyz
Is there any easy way to ignore first two and last two values in a list using HQL?

You should be able to use regexp_extract:
select regexp_extract(column_B, '^\\s*(\\d+\\s+){2}(.*?)(\\s+\\d+){2}\\s*$', 2)
The first part of the regex skips the first two values, and the last part skips the last two values, leaving just the middle part to be extracted into group 2 which is what is returned by the expression.
Here's a demo of the regex working on regex101.com

Related

How to sum only one of repeated values from joined data in RDLC

I'm not sure if SSRS is dumb, or I am (I'm leaning towards both).
I have a dataset that (as a result of joins etc) has some columns with the same values duplicated across every row (fairly standard database stuff):
rid cnt bid flg1 flg2
-------------------------------
4 2882 1 17 3
5 2784 1 17 3
6 1293 1 17 3
18 9288 2 4 9
20 762 2 4 9
Reporting based on cnt is straightforward enough. I can also make a tablix that shows the following:
bid flg1 flg2
------------------
1 17 3
2 4 9
(Where the tablix is grouped by Fields!bid.Value and the columns are just Fields!flg1.Value and Fields!flg2.Value respectively.)
What I can't figure out is how to display the sum of these values -- specifically I want to show that the sum of flg1 is 21 and the sum of flg2 is 12 -- not the sum of every row in the dataset (counting each value more than once).
(Note that I'm not looking for a sum of distinct values, as they may not be unique. I want a sum of one value from each bid group, because it's from a table join so they will always have the same value.)
If possible, I'd also like to be able to do a similar calculation at the top level of the report (not in any tablix); although I'd settle for hiding the detail row if that's the only way.
Obviously, Sum(Fields!flg1.Value) isn't the answer, as this either returns 51 (if on the first row inside the group) or 59 (if outside it).
I also tried Sum(Fields!flg1.Value, "bid") but this wasn't considered a valid scope.
I also tried Sum(First(Fields!flg1.Value, "bid")) but apparently you're not allowed to sum first values for some weird reason (and may have had the same scope problem anyway).
Using Sum(Max(Fields!flg1.Value, "bid")) does work, but feels wrong. Is there a better way to do this?
(Related: is there a good way to save the result of that calculation so that I can later also show a Sum of those totals without an even hairier expression?)
There are two basic ways to do this.
Do what you have already done (Sum(Max(Fields!flg1.Value, "bid")))
Sum the rendered values. To do this check the name of the cell containing the data you want (check it's properties) and then use something like =SUM(ReportItems!flg1.Value) where flg1 is the name of the textbox, which is not necessarily always the same name as the field.

Finding Reccurring Number Combinations in Column of Numbers

I have searched and found discussions and solutions to similar problems, but not quite or as complex as I'm trying to figure out.
I have an access table which consists of two columns Draw Number and Number Drawn as shown below. Draw Number is repeated 20 times, to correspond to the 20 numbers that are drawn in each particular draw.
I'm trying to figure a way to determine the most frequent occurring combination of numbers (5 numbers) for all of the draws in each of the 20 number sets. So for instance, 12341 occurs n x, 12342 occurs nx, 12343 occurs n x, etc.
I've created parameter queries which allow me to search for different number combinations from 2 to 10 numbers, and they work OK returning the number of occurrences of a combination of numbers that I input through a simple UI. But the goal is to figure out pragmatically what the optimum combination of numbers.
Hope this makes sense. And by the way, there are 36 million or so rows in the table. The para queries work quite well however; it takes just over a second to return results for each number added. So, query two numbers = 2 second wait, three numbers = 3 second wait, etc.
I've been thinking about a loop of some type but don't know how to get started? Processing time isn't an issue; can take a day if required!
This is written in VBA and has an assortment of queries, temp tables, etc to get the job done.
The text says Access, but the tags say MySql, which is it? – RBarryYoung 21 hours ago
This part confuses me: I'm trying to figure a way to determine the most frequent occurring combination of numbers (5 numbers) for all of the draws in each of the 20 number sets. So for instance, 12341 occurs n x, 12342 occurs nx, 12343 occurs n x, etc. – Newd 21 hours ago
^What do you mean five numbers? No where in your sample data do I see 12341. Please explain using the data you have, and give expected results using that data. – McAdam331 21 hours ago
drosberg - clarification:
thanks for the response. It is an Access application, but as a first-time poster Stackoverflow recommends tags?
By five numbers I mean the most frequently occurring group of five numbers (I used five as an example, could be groups of 2 to 10 numbers) which occur in each draw, where a draw consists of 20 drawn numbers from a total of 80 numbers. So the data that I posted was intended as an example. The sample provided only has 50, 51 in common. I can plug 50 and 51 into the parameter query and it will tell me that this combination occurs 60,000 times (or whatever), but perhaps 50 and 57 occurs 65,000 times.
If i was to do this manually, and assuming I'm looking for the most frequent 5 number combination I would enter the following in the parameter query: 1,2,3,4,1 group = 30,000 occurrences 1,2,3,4,2 group = 31,000 occurrences 1,2,3,4,3 group = 31,050 occurrences 1,2,3,4,4 group = 29,050 occurrences etc........... etc...........
but I would have to do this for every combination of 5 numbers that can be derived from the numbers 1 thru 80. I'm hoping to have program do the work!!
thanks
don
DRAW NUMBER NUMBER DRAWN
1 1
1 28
1 19
1 3
1 38
1 46
1 43
1 29
1 13
1 22
1 20
1 11
1 50
1 51
1 53
1 54
1 57
1 64
1 76
1 78
2 29
2 14
2 2
2 1
2 35
2 40
2 39
2 30
2 10
2 27
2 21
2 6
2 42
2 50
2 51
2 53
2 54
2 61
2 65
2 69
I wrote a post a while ago about generating permutations with and without repetition using Excel. Perhaps you can use it.
https://michiel.wordpress.com/2015/03/29/permutations-with-repetition-using-excel/
Here's how it works. I am using strings, but you can easily modify that for numbers (since you say you need 5).
You can use the MID function to grab a single char from a string, and generate permutations from it.
=MID(Pattern,MOD([N]/[P],Length)+1,1)
N revers to the column N
P refers to the horizontal row (1,4,16). You can generate these with a formula like =4^.
After putting in the code, you can make a list of all permutations in Excel and in the cell next to it generate a sql query that you can perform as well from VBA.
Example: Looking up Access database in Excel
Or find a commercial tool like http://thingiequery.com/
I don't know if there's any open source tools for it.
I'm thinking that you should consider:
Say there are 100 balls.
Setting up a table to have one row for each "Draw number" with 100 columns one for every possible number each column has type boolean.
When you look to see which draws had number 23 you just add a
WHERE Column23 = true.
For numbers 23 and 56
WHERE Column23 = true AND Column56 = true
This should massivel simplify and speed up your SQL.
You set up a table with every possible combination of numbers.
You run SQL to find the counts.
Harvey

Matching two sets of data, add rows when necessary

I am working on a large sets of data. Each set of data have three rows, my job is to match the first two rows. My difficulty is this, if row 1&2 doesn't match, I need to create a new row for next set of data. So in the beginning, the data looks like this:
A
1 00
2 A001
3 Y
4 00
5 A002
6 N
And as you can see, not only A1 & A4 needs to match, but also A2 & A5, so because A2 & A5 don't match, I need to create a new row for it. So the result will look like this:
A B
1 00
2 A001
3 Y
4 00
5 A002
6 N
I don't know any other way other than manually adding rows, and I need to combine 80 sets of data, so if anyone could help me with this, I will be so thankful!!
You say you have 80 sets of data. Does it means that you have 240 rows in your table? If so maybe creating a plsql table is a solution.
Each row of the plsql table has 3 columns (the 3 rows of 1 data set).
From the table you select each time 3 rows; and these rows you put into the plsql table.
When the plsql table is filled, then you use a loop in a loop (2 loops) through the plsql table. So you can compare each record from table 1 with the records of table2. And if they don't match you can do your thing.

Column 1 with names and Column 2 with values

Let's say I have a Text document. There are two columns. Column 1 contains a list of names while column contains a list of value relating to those names. The problem is column 1 may have same names repeating on different rows. This is not an error though.
For ex:
Frank Burton 13
Joe Donnigan 22
John Smith 45
Cooper White 53
Joe Donnigan 19
What are the ways to organize my data in a way that I would have column 1 with unique data names and column 2 with the values summed together relating column 1? What can I do if I have these data in excel?
For ex:
Frank Burton 13
Joe Donnigan 41
John Smith 45
Cooper White 53
Thanks a bunch!
In mySQL you could write a query similar to...
Select col1, Sum(col2) FROM TableName group by col1
In Excel you could use a pivot table to group the information together
Insert Pivot table, select range enter values as in image below.

how to push data down a row in sql results

I would like help with sql query code to push the consequent data in a specific column down by a row.
For example in a random table like the following,
x column y column
6 6
9 4
89 30
34 15
the results should be "pushed" down a row, meaning
x column y column
6 null or 0 (preferably)
9 6
89 4
34 30
SQL tables have no inherent concept of ordering. Hence, the concept of "next row" does not make sense.
Your example has no column that specifies the order for the rows. There is no definition of next. So, what you want to do cannot be done.
I am not aware of a simple way to do this with the way you are showing the table being formatted. If your perhaps added two consecutively numbered integer fields that provide row number and row number + 1 values, you could join the table to itself and get that information.
After taking a backup of you table:
Make a PHP function that will:
- Load all values of Y into an array
- Set Y = 0 (MYSQL UPDATE)
- load the values back from PHP array to MYSQL