Related
I want to achieve following - in a csv file there are records (lines) with comma-separated values of arbitrary length, then I want to pass to the parametrized test method first N (say, 3, but whatever) arguments a String, and the rest - as some collection. That said, I want to achieve something like this:
class Tests {
#DisplayName("Data Test")
#ParameterizedTest(name = "{0} → {1}; {2} → {3}")
#CsvFileSource(resources = ["/data.csv"], numLinesToSkip = 1)
fun runTests(spec0: String, spec1: String, input: String, outputs: List<String>) {
assertData(spec0, spec1, input, outputs)
}
}
However, I actually don't know what it the best way to do it. The current workaround I'm using is to just store dynamic length values as a single string with some separator and postprocess the last argument:
class Tests {
#DisplayName("Data Test")
#ParameterizedTest(name = "{0} → {1}; {2} → {3}")
#CsvFileSource(resources = ["/data.csv"], numLinesToSkip = 1)
fun runTests(spec0: String, spec1: String, input: String, outputs: String) {
assertData(spec0, spec1, input, outputs.split('␞'))
}
}
What would be the best (more idiomatic) way to achieve this?
I just don't want have data in csv file with this additional separator.
I have a json object like so:
{
_id: "12345",
identifier: [
{
value: "1",
system: "system1",
text: "text!"
},
{
value: "2",
system: "system1"
}
]
}
How can I use the XDevAPI SearchConditionStr to look for the specific combination of value + system in the identifier array? Something like this, but this doesn't seem to work...
collection.find("'${identifier.value}' IN identifier[*].value && '${identifier.system} IN identifier[*].system")
By using the IN operator, what happens underneath the covers is basically a call to JSON_CONTAINS().
So, if you call:
collection.find(":v IN identifier[*].value && :s IN identifier[*].system")
.bind('v', '1')
.bind('s', 'system1')
.execute()
What gets executed, in the end, is (simplified):
JSON_CONTAINS('["1", "2"]', '"2"') AND JSON_CONTAINS('["system1", "system1"]', '"system1"')
In this case, both those conditions are true, and the document will be returned.
The atomic unit is the document (not a slice of that document). So, in your case, regardless of the value of value and/or system, you are still looking for the same document (the one whose _id is '12345'). Using such a statement, the document is either returned if all search values are part of it, and it is not returned if one is not.
For instance, the following would not yield any results:
collection.find(":v IN identifier[*].value && :s IN identifier[*].system")
.bind('v', '1')
.bind('s', 'system2')
.execute()
EDIT: Potential workaround
I don't think using the CRUD API will allow to perform this kind of "cherry-picking", but you can always use SQL. In that case, one strategy that comes to mind is to use JSON_SEARCH() for retrieving an array of paths corresponding to each value in the scope of identifier[*].value and identifier[*].system i.e. the array indexes and use JSON_OVERLAPS() to ensure they are equal.
session.sql(`select * from collection WHERE json_overlaps(json_search(json_extract(doc, '$.identifier[*].value'), 'all', ?), json_search(json_extract(doc, '$.identifier[*].system'), 'all', ?))`)
.bind('2', 'system1')
.execute()
In this case, the result set will only include documents where the identifier array contains at least one JSON object element where value is equal to '2' and system is equal to system1. The filter is effectively applied over individual array items and not in aggregate, like on a basic IN operation.
Disclaimer: I'm the lead developer of the MySQL X DevAPI Connector for Node.js
The long title contain also a mini-exaple because I couldn't explain well what I'm trying to do. Nonethless, the similar questions windows led me to various implementation. But since I read multiple times that it's a bad design, I would like to ask if what I'm trying to do is a bad design rather asking how to do it. For this reason I will try to explain my use case with a minial functional code.
Suppose I have a two classes, each of them with their own parameters:
class MyClass1:
def __init__(self,param1=1,param2=2):
self.param1=param1
self.param2=param2
class MyClass2:
def __init__(self,param3=3,param4=4):
self.param3=param3
self.param4=param4
I want to print param1...param4 as a string (i.e. "param1"..."param4") and not its value (i.e.=1...4).
Why? Two reasons in my case:
I have a GUI where the user is asked to select one of of the class
type (Myclass1, Myclass2) and then it's asked to insert the values
for the parameters of that class. The GUI then must show the
parameter names ("param1", "param2" if MyClass1 was chosen) as a
label with the Entry Widget to get the value. Now, suppose the
number of MyClass and parameter is very high, like 10 classes and 20
parameters per class. In order to minimize the written code and to
make it flexible (add or remove parameters from classes without
modifying the GUI code) I would like to cycle all the parameter of
Myclass and for each of them create the relative widget, thus I need
the paramx names under the form od string. The real application I'm
working on is even more complex, like parameter are inside other
objects of classes, but I used the simpliest example. One solution
would be to define every parameter as an object where
param1.name="param1" and param1.value=1. Thus in the GUI I would
print param1.name. But this lead to a specifi problem of my
implementation, that's reason 2:
MyClass1..MyClassN will be at some point printed in a JSON. The JSON
will be a huge file, and also since it's a complex tree (the example
is simple) I want to make it as simple as possibile. To explain why
I don't like to solution above, suppose this situation:
class MyClass1:
def init(self,param1,param2,combinations=[]):
self.param1=param1
self.param2=param2
self.combinations=combinations
Supposse param1 and param2 are now list of variable size, and
combination is a list where each element is composed by all the
combination of param1 and param2, and generate an output from some
sort of calculation. Each element of the list combinations is an
object SingleCombination,for example (metacode):
param1=[1,2] param2=[5,6] SingleCombination.param1=1
SingleCombination.param2=5 SingleCombination.output=1*5
MyInst1.combinations.append(SingleCombination).
In my case I will further incapsulated param1,param2 in a object
called parameters, so every condition will hace a nice tree with
only two object, parameters and output, and expanding parameters
node will show all the parameters with their value.
If I use JSON pickle to generate a JSON from the situation above, it
is nicely displayed since the name of the node will be the name of
the varaible ("param1", "param2" as strings in the JSON). But if I
do the trick at the end of situation (1), creating an object of
paramN as paramN.name and paramN.value, the JSON tree will become
ugly but especially huge, because if I have a big number of
condition, every paramN contains 2 sub-element. I wrote the
situation and displayed with a JSON Viewer, see the attached immage
I could pre processing the data structure before creating the JSON,
the problem is that I use the JSON to recreate the data structure in
another session of the program, so I need all the pieces of the data
structure to be in the JSON.
So, from my requirements, it seems that the workround to avoid print the variable names creates some side effect on the JSON visualization that I don't know how to solve without changing the logic of my program...
If you use dataclasses, getting the field names is pretty straightforward:
from dataclasses import dataclass, fields
#dataclass
class MyClass1:
first:int = 4
>>> fields(MyClass1)
(Field(name='first',type=<class 'int'>,default=4,...),)
This way, you can iterate over the class fields and ask your user to fill them. Note the field has a type, which you could use to eg ask the user for several values, as in your example.
You could add functions to extract programatically the param names (_show_inputs below ) from the class and values from instances (_json below ):
def blossom(cls):
"""decorate a class with `_json` (classmethod) and `_show_inputs` (bound)"""
def _json(self):
return json.dumps(self, cls=DataClassEncoder)
def _show_inputs(cls):
return {
field.name: field.type.__name__
for field in fields(cls)
}
cls._json = _json
cls._show_inputs = classmethod(_show_inputs)
return cls
NOTE 1: There's actually no need to decorate the classes with blossom. You could just use its internal functions programatically.
Using a custom json encoder to dump the dataclass objects, including properties:
import json
class DataClassPropEncoder(json.JSONEncoder): # https://stackoverflow.com/a/51286749/7814595
def default(self, o):
if is_dataclass(o):
cls = type(o)
# inject instance properties
props = {
name: getattr(o, name)
for name, value in cls.__dict__.items() if isinstance(value, property)
}
return {
**props,
**asdict(o)
}
return super().default(o)
Finally, wrap the computations inside properties so they are
serialized as well when using the decorated class. Full code example:
from dataclasses import asdict
from dataclasses import dataclass
from dataclasses import fields
from dataclasses import is_dataclass
import json
from itertools import product
from typing import List
class DataClassPropEncoder(json.JSONEncoder): # https://stackoverflow.com/a/51286749/7814595
def default(self, o):
if is_dataclass(o):
cls = type(o)
props = {
name: getattr(o, name)
for name, value in cls.__dict__.items() if isinstance(value, property)
}
return {
**props,
**asdict(o)
}
return super().default(o)
def blossom(cls):
def _json(self):
return json.dumps(self, cls=DataClassEncoder)
def _show_inputs(cls):
return {
field.name: field.type.__name__
for field in fields(cls)
}
cls._json = _json
cls._show_inputs = classmethod(_show_inputs)
return cls
#blossom
#dataclass
class MyClass1:
param1:int
param2:int
#blossom
#dataclass
class MyClass2:
param3: List[str]
param4: List[int]
def _compute_single(self, values): # TODO: implmement this
return values[0]*values[1]
#property
def combinations(self):
# TODO: cache if used more than once
# TODO: combinations might explode
field_names = []
field_values = []
cls = type(self)
for field in fields(cls):
field_names.append(field.name)
field_values.append(getattr(self, field.name))
results = []
for values in product(*field_values):
result = {
**{
field_names[idx]: value
for idx, value in enumerate(values)
},
"output": self._compute_single(values)
}
results.append(result)
return results
>>> print(f"MyClass1:\n{MyClass1._show_inputs()}")
MyClass1:
{'param1': 'int', 'param2': 'int'}
>>> print(f"MyClass2:\n{MyClass2._show_inputs()}")
MyClass2:
{'param3': 'List', 'param4': 'List'}
>>> obj_1 = MyClass1(3,4)
>>> print(f"obj_1:\n{obj_1._json()}")
obj_1:
{"param1": 3, "param2": 4}
>>> obj_2 = MyClass2(["first", "second"],[4,2])._json()
>>> print(f"obj_2:\n{obj_2._json()}")
obj_2:
{"combinations": [{"param3": "first", "param4": 4, "output": "firstfirstfirstfirst"}, {"param3": "first", "param4": 2, "output": "firstfirst"}, {"param3": "second", "param4": 4, "output": "secondsecondsecondsecond"}, {"param3": "second", "param4": 2, "output": "secondsecond"}], "param3": ["first", "second"], "param4": [4, 2]}
NOTE 2: If you need to perform several computations per class, it might be a good idea to abstract away the pattern in the combinations property to avoid repeating code.
NOTE 3: If you need access to the properties several times and not ust once, you might want to consider caching their values to avoid re-computation.
Once you have an instance of MyClass / MyClass2, you can call vars() or vars().keys() and it will give you the attributes as a str. Unlike dir, it will not show all the builtin attributes/methods starting with __.
class MyClass2:
def __init__(self,param3=3,param4=4):
self.param3=param3
self.param4=param4
instance_of_myclass2 = MyClass2(param3="what", param4="ever")
print(vars(instance_of_myclass2))
{'param3': 'what', 'param4': 'ever'}
print(vars(instance_of_myclass2).keys())
dict_keys(['param3', 'param4'])
dir(instance_of_myclass2)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'param3', 'param4']
Is it possible to write a hash function in such a way that hash of a string is equal to the hash of the same string but reversed? e.g.
hash("life") == hash("efil")
hash("life") == hash("life")
It would most likely defeat the purpose of it being a good uniform hash function, but sure, you can define a function like that. Anything that maps data of arbitrary size to data of fixed size is a hash function.
You could e.g. sort the characters in the input before passing the string to some other hash function, example in Python:
def myhash(s):
return hash(''.join(sorted(s)))
myhash("life") == myhash("efil")
returns True.
Another option, as sketched before in the comments:
def myhash2(s):
return hash(''.join(sorted([s,''.join(reversed(s))])))
from itertools import permutations
for r in sorted(((myhash2(s),s) for s in (''.join(p) for p in permutations('life')))):
print r
returns
(-8040601677570703892, 'efil')
(-8040601677570703892, 'life')
(-7809799233501675124, 'feli')
(-7809799233501675124, 'ilef')
(-7087163641342373728, 'feil')
(-7087163641342373728, 'lief')
(-3309334617147859636, 'ifel')
(-3309334617147859636, 'lefi')
(-1342443318618769016, 'flei')
(-1342443318618769016, 'ielf')
(-1101501415991241164, 'elif')
(-1101501415991241164, 'file')
(-94298563527211992, 'elfi')
(-94298563527211992, 'ifle')
(1475311783909651192, 'efli')
(1475311783909651192, 'ilfe')
(1616670693919962800, 'eifl')
(1616670693919962800, 'lfie')
(1783747732969301136, 'iefl')
(1783747732969301136, 'lfei')
(1880557847288313568, 'fiel')
(1880557847288313568, 'leif')
(4933084291912123320, 'eilf')
(4933084291912123320, 'flie')
I still like the comment/solution of #deceze best...
I store a hash to the mysql,but the result make me confuse:
hash:
{:unique_id=>35, :description=>nil, :title=>{"all"=>"test", "en"=>"test"}...}
and I use the serialize in my model.
serialize :title
The result in mysql like this:
--- !ruby/hash:ActiveSupport::HashWithIndifferentAccess
all: test
en: test
Anyone can tell me what is the meaning? Why there is a ruby/hash:ActiveSupport::HashWithIndifferentAccess in mysql?
TL;DR:
serialize :title, Hash
What’s happening here is that serialize internally will yaml-dump the class instance. And the hashes in rails are monkeypatched to may having an indifferent access. The latter means, that you are free to use both strings and respective symbols as it’s keys:
h = { 'a' => 42 }.with_indifferent_access
puts h[:a]
#⇒ 42
Hash needs to be serialized, default serializer is YAML which supports in some way, in the Ruby implementation, the storing of type. Your hash is of type ActiveSupport::HashWithIndifferentAccess, so when fetched back, ruby knows what object should get back (unserializes it to an HashWithIndifferentAccess).
Notice that HashWithIndifferentAccess is a hash where you can access values by using either strings or symbols, so:
tmp = { foo: 'bar' }.with_indifferent_access
tmp[:foo] # => 'bar'
tmp['foo'] # => 'bar'
With a normal hash (Hash.new), you would get instead:
tmp = { foo: 'bar' }
tmp[:foo] # => 'bar'
tmp['foo'] # => nil
Obviously, in mysql, the hash is stored as a simple string, (YAML is plain text with a convention)