Finding duplicates in ABAP internal table via grouping

Finding duplicates in ABAP internal table via grouping - duplicates

We all know these excellent ABAP statements which allows finding unique values in one-liner:
it_unique = VALUE #( FOR GROUPS value OF <line> IN it_itab
GROUP BY <line>-field WITHOUT MEMBERS ( value ) ).
But what about extracting duplicates? Can one utilize GROUP BY syntax for that task or, maybe, table comprehensions are more useful here?
The only (though not very elegant) way I found is:
LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<fs_marc>) GROUP BY ( matnr = <fs_marc>-matnr
werks = <fs_marc>-werks )
ASSIGNING FIELD-SYMBOL(<group>).
members = VALUE #( FOR m IN GROUP <group> ( m ) ).
IF lines( members ) > 1.
"throw error
ENDIF.
ENDLOOP.
Is there more beautiful way of finding duplicates by arbitrary key?

So, I just put it as answer, as we with Florian weren't able to think out something better. If somebody is able to improve it, just do it.
TYPES tt_materials TYPE STANDARD TABLE OF marc WITH DEFAULT KEY.
DATA duplicates TYPE tt_materials.
LOOP AT materials INTO DATA(material)
GROUP BY ( id = material-matnr
status = material-pstat
size = GROUP SIZE )
ASCENDING REFERENCE INTO DATA(group_ref).
CHECK group_ref->*-size > 1.
duplicates = VALUE tt_materials( BASE duplicates FOR <status> IN GROUP group_ref ( <status> ) ).
ENDLOOP.

Given
TYPES: BEGIN OF key_row_type,
matnr TYPE matnr,
werks TYPE werks_d,
END OF key_row_type.
TYPES key_table_type TYPE
STANDARD TABLE OF key_row_type
WITH DEFAULT KEY.
TYPES: BEGIN OF group_row_type,
matnr TYPE matnr,
werks TYPE werks_d,
size TYPE i,
END OF group_row_type.
TYPES group_table_type TYPE
STANDARD TABLE OF group_row_type
WITH DEFAULT KEY.
TYPES tt_materials TYPE STANDARD TABLE OF marc WITH DEFAULT KEY.
DATA(materials) = VALUE tt_materials(
( matnr = '23' werks = 'US' maabc = 'B' )
( matnr = '42' werks = 'DE' maabc = 'A' )
( matnr = '42' werks = 'DE' maabc = 'B' ) ).
When
DATA(duplicates) =
VALUE key_table_type(
FOR key IN VALUE group_table_type(
FOR GROUPS group OF material IN materials
GROUP BY ( matnr = material-matnr
werks = material-werks
size = GROUP SIZE )
WITHOUT MEMBERS ( group ) )
WHERE ( size > 1 )
( matnr = key-matnr
werks = key-werks ) ).
Then
cl_abap_unit_assert=>assert_equals(
act = duplicates
exp = VALUE tt_materials( ( matnr = '42' werks = 'DE') ) ).
Readability of this solution is so bad that you should only ever use it in a method with a revealing name like collect_duplicate_keys.
Also note that the statement's length increases with a growing number of key fields, as the GROUP SIZE addition requires listing the key fields one by one as a list of simple types.

What about the classics? I'm not sure if they are deprecated or so, but my first think is about to create a table clone, DELETE ADJACENT-DUPLICATES on it and then just compare both lines( )...
I'll be eager to read new options.

Related

How to remove an element in jsonb integer array in PostgreSQL

On Postgres, in a table called "photo" I have a jsonb column called "id_us" containing a json integer array, simply like this one [1,2,3,4]
I would like to find the query to remove the element 3 for example.
The closer I could get is this
SELECT jsonb_set(id_us, ''
, (SELECT jsonb_agg(val)
FROM jsonb_array_elements(p.id_us) x(val)
WHERE val <> jsonb '3')
) AS id_us
FROM photo p;
Any idea how to solve this?
Thank you!

You can use a subquery containing JSONB_AGG() function while filtering out the index value 3(by starting indexing from 1) such as
WITH p AS
(
SELECT JSONB_AGG(j) AS js
FROM photo
CROSS JOIN JSONB_ARRAY_ELEMENTS(id_us)
WITH ORDINALITY arr(j,idx)
WHERE idx != 3
)
UPDATE photo
SET id_us = js
FROM p
Demo
Edit : If you need to remove the value but not index as mentioned in the comment, just use the variable j casted as numeric
WITH p AS
(
SELECT JSONB_AGG(j) AS js
FROM photo
CROSS JOIN JSONB_ARRAY_ELEMENTS(id_us)
WITH ORDINALITY arr(j,idx)
WHERE j::INT != 18
)
UPDATE photo
SET id_us = js
FROM p
Demo
P.S: using JSONB_SET(), the comma-seperated place for the removed element along with quotes will still remain in such a way that in the following
WITH p AS
(
SELECT ('{'||idx-1||'}')::TEXT[] AS idx
FROM photo
CROSS JOIN JSONB_ARRAY_ELEMENTS(id_us)
WITH ORDINALITY arr(j,idx)
WHERE j::INT = 18
)
UPDATE photo
SET id_us = JSONB_SET(id_us,idx,'""')
FROM p;
SELECT * FROM photo;
id_us
-----------------
[127, 52, "", 44]

I've run across a similar issue, and it stems from the - operator. This operator is overloaded to accept either text or integer, but acts differently for each type. Using text will remove by value, and using an integer will remove by index. But what if your value IS an integer? Well then you're shit outta luck...
If possible, you can try changing your jsonb integer array to a jsonb string array (of integers), and then the - operator should work smoothly.
e.g.
'{1,2,3}' - 2 = '{1,2}' -- removes index 2
'{1,2,3}' - '2' = '{1,2,3}' -- removes values == '2' (but '2' != 2)
'{"1","2","3"}' - 2 = '{"1","2"}' -- removes index 2
'{"1","2","3"}' - '2' = '{"1","3"}' -- removes values == '2'

Query SQL database with JSON Value

Here is my JSON:
[{"Key":"schedulerItemType","Value":"schedule"},{"Key":"scheduleId","Value":"82"},{"Key":"scheduleEventId","Value":"-1"},{"Key":"scheduleTypeId","Value":"2"},{"Key":"scheduleName","Value":"Fixed Schedule"},{"Key":"moduleId","Value":"5"}]
I want to query the database by FileMetadata column
I've tried this:
SELECT * FROM FileSystemItems WHERE JSON_VALUE(FileMetadata, '$.Key') = 'scheduleId' and JSON_VALUE(FileMetadata, '$.Value') = '82'
but it doesn't work!
I had it working with just a dictionary key/value pair, but I needed to return the data differently, so I am adding it with key and value into the json now.
What am I doing wrong?

With the sample data given you'd have to supply an array index to query the 1th element (0-based array indexes), e.g.:
select *
from dbo.FileSystemItems
where json_value(FileMetadata, '$[1].Key') = 'scheduleId'
and json_value(FileMetadata, '$[1].Value') = '82'
If the scheduleId key can appear at arbitrary positions in the array then you can restructure the query to use OPENJSON instead, e.g.:
select *
from dbo.FileSystemItems
cross apply openjson(FileMetadata) with (
[Key] nvarchar(50) N'$.Key',
Value nvarchar(50) N'$.Value'
) j
where j.[Key] = N'scheduleId'
and j.Value = N'82'

How to create union of two different django-models?

I have two django-models
class ModelA(models.Model):
title = models.CharField(..., db_column='title')
text_a = models.CharField(..., db_column='text_a')
other_column = models.CharField(/*...*/ db_column='other_column_a')
class ModelB(models.Model):
title = models.CharField(..., db_column='title')
text_a = models.CharField(..., db_column='text_b')
other_column = None
Then I want to merge the two querysets of this models using union
ModelA.objects.all().union(ModelB.objects.all())
But in query I see
(SELECT
`model_a`.`title`,
`model_a`.`text_a`,
`model_a`.`other_column`
FROM `model_a`)
UNION
(SELECT
`model_b`.`title`,
`model_b`.`text_b`
FROM `model_b`)
Of course I got the exception The used SELECT statements have a different number of columns.
How to create the aliases and fake columns to use union-query?

You can annotate your last column to make up for column number mismatch.
a = ModelA.objects.values_list('text_a', 'title', 'other_column')
b = ModelB.objects.values_list('text_a', 'title')
.annotate(other_column=Value("Placeholder", CharField()))
# for a list of tuples
a.union(b)
# or if you want list of dict
# (this has to be the values of the base query, in this case a)
a.union(b).values('text_a', 'title', 'other_column')

In SQL query, we can use NULL to define the remaining columns/aliases
(SELECT
`model_a`.`title`,
`model_a`.`text_a`,
`model_a`.`other_column`
FROM `model_a`)
UNION
(SELECT
`model_b`.`title`,
`model_b`.`text_b`,
NULL
FROM `model_b`)

In Django, union operations needs to have same columns, so with values_list you can use those specific columns only like this:
qsa = ModelA.objects.all().values('text_a', 'title')
qsb = ModelB.objects.all().values('text_a', 'title')
qsa.union(qsb)
But there is no way(that I know of) to mimic NULL in union in Django. So there are two ways you can proceed here.
First One, add an extra field in your Model with name other_column. You can put the values empty like this:
other_column = models.CharField(max_length=255, null=True, default=None)
and use the Django queryset union operations as described in here.
Last One, the approach is bit pythonic. Try like this:
a = ModelA.objects.values_list('text_a', 'title', 'other_column')
b = ModelB.objects.values_list('text_a', 'title')
union_list = list()
for i in range(0, len(a)):
if b[i] not in a[i]:
union_list.append(b[i])
union_list.append(a[i])
Hope it helps!!

Error in SQL Stored Procedure Logic for a Report Using a Staff Name DropDown Filter

I have a report that shows the initial and most recent progress scores for a customer. One of the dropdown filters used is the 'Staff' filter that allows the user to select 'All' staff members or select a specific staff member to filter the report by.
The issue I'm encountering is that the initial and most recent scores are showing correctly when the 'Staff' dropdown filter is set to 'All', but it does not show correctly (the initial and most recent scores are the same for all clients) when the filter is set to a specific staff member's name.
As I'm not receiving an error, I assume the syntax is correct and the issue must be with the logic. I've included a snippet of the stored procedure below. Any help is appreciated. Thank you in advance.
JOIN person familymember on familymember.idfamily = family.id
LEFT JOIN capCase cc on cc.idFamilymember = familymember.id
LEFT JOIN #applicantFamily on #applicantFamily.idfamily = Family.id --and isnumeric(#classroom) = 1
WHERE main.idSSMatrix = #idSSMatrix
AND main1.StartDate between #dateFrom AND #dateTo
AND ( cast ( main.IdProgram as varchar ) = #idProgram OR isnumeric(#idProgram) <> 1 )
AND ( cast ( cc.idType as varchar ) = #idProgramCapCase OR isnumeric(#idProgramCapCase) <> 1 )
AND (cast( main.idstaff as varchar) = #staff OR isnumeric( #staff) <> 1)
AND (isnumeric(#applicantFamily.idfamily ) = 1 or (isnumeric(#classroom) <> 1 ))
AND main1.id is not null

SQL Query w/ multiple name / value pairs in Where clause

I have a table called Properties (pid, uid, pname, pvalue). The pid column is auto generated. Each uid (user id) could have multiple name value pairs stored in the pname and pvalue.
As an input I've multiple name value pairs for pname and pvalue which forms a complicated boolean expression.
For example: let's start w/ one name value pair. Say I want to retrieve all uid's whose 'favorite_color' is 'red'.
I wrote an SQL query:
SELECT *
FROM properties
WHERE ((pname = 'favorite_color') and (pvalue = 'red'))
The query soon gets complicated if I had to retrieve something like, fetch all uid's whose 'favorite_color' is 'red' or 'blue' and 'favorite_drink' is 'juice' or ' 'milk' and 'favorite_ hobby' is 'music' or 'art' etc.
I wrote an SQL query:
SELECT *
FROM properties
WHERE (((pname = 'favorite_color') and (pvalue = 'red'))
OR ((pname = 'favorite_color') and (pvalue = 'blue')))
AND (((pname = 'favorite_drink') and (pvalue = 'juice'))
OR ((pname = 'favorite_drink') and (pvalue = 'milk')))
AND (((pname = 'favorite_hobby') and (pvalue = 'music'))
OR ((pname = 'favorite_hobby') and (pvalue = 'art')))
I got the expression correct but unfortunately it fails because the evaluation is done on each row. What if I wanted to add more name value pairs to the where clause?
Questions:
Is it possible to write and SQL query for this?
The other idea I had was to fetch all the pname, pvalue pairs for each user, build a dynamic expression using an expression language and my input name value paris to evaluate it. I've apache's JEXL in mind.

To do the ANDs you need to do as many self-joins as you have and-ed conditions:
SELECT *
FROM Properties p1, Properties p2, Properties p3
WHERE p1.uid = p2.uid AND p1.uid = p3.uid
AND (p1.pname = 'favorite_color' AND p1.pvalue IN ('red', 'blue'))
AND (p2.pname = 'favorite_drink' AND p2.pvalue IN ('juice', 'milk'))
AND (p3.pname = 'favorite_hobby' AND p2.pvalue IN ('music', 'art'))
EDIT:
Another possibility is to denormalize the data, and then use FIND_IN_SET() or RLIKE:
SELECT uid, group_concat(concat(pname, '=', pvalue)) props
FROM Properties
GROUP BY uid
HAVING props RLIKE 'favorite_color=(red|blue)'
AND props RLIKE 'favorite_drink=(juice|milk)'
AND props RLIKE 'favorite_hobby=(music|art)'

SELECT p1.*,p2.pname,p2.pvalue,p3.pname,p3.pvalue
FROM (
(Select *
from Properties
where (pname = 'favorite_color' AND pvalue IN ('red', 'blue')) p1,
(Select *
from Properties
where (pname = 'favorite_drink' AND pvalue IN ('juice','milk')) p2,
(Select *
from Properties
where (pname = 'favorite_hobby' AND pvalue IN ('music', 'art')) p3,
)
WHERE p1.uid = p2.uid AND p1.uid = p3.uid

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

Finding duplicates in ABAP internal table via grouping - duplicates

What about the classics? I'm not sure if they are deprecated or so, but my first think is about to create a table clone, DELETE ADJACENT-DUPLICATES on it and then just compare both lines( )... I'll be eager to read new options.

Related

How to remove an element in jsonb integer array in PostgreSQL

Query SQL database with JSON Value

How to create union of two different django-models?

Error in SQL Stored Procedure Logic for a Report Using a Staff Name DropDown Filter

SQL Query w/ multiple name / value pairs in Where clause

Categories

Resources