Neo4j JSON APOC load - skip nulls

I'm trying to load some JSON from a REST API (using Neo4j 3.0.4 & APOC apoc-3.0.4.1-all) that has null values in it. This throws the following error:
"Cannot merge node using null property value"
The nulls can be spread across multiple keys and it varies which keys have null values. Hence I'd prefer to avoid specifying which individual keys to handle nulls for if possible.
I found the apoc.map.clean(map,[keys],[values]) procedure, but not much info on how to use it. Is this the best procedure for handling nulls across every key, or is there a simpler way?
Thanks!

Thanks stdob - I managed to find another post you had written which helped me understand the solution. I needed to substitute the first property for one that was never null:
MERGE (label:Label {key2: json.key2})
ON CREATE SET label.key3 = json.key3, label.key1 = json.key1
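
For anyone who wants to keep every non-null property rather than pick a safe merge key, here is a minimal sketch of the apoc.map.clean approach the question asks about, run through the Python driver. The URL, label, and key names are illustrative, and on Neo4j 3.0 the older {url} parameter syntax applies instead of $url:

from neo4j import GraphDatabase  # official Neo4j Python driver

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))

# apoc.map.clean(map, keysToRemove, valuesToRemove) drops every entry whose
# value appears in the third list, so [null] strips all null-valued
# properties without naming each key individually.
cypher = """
CALL apoc.load.json($url) YIELD value AS json
MERGE (label:Label {key2: json.key2})
SET label += apoc.map.clean(json, ['key2'], [null])
"""

with driver.session() as session:
    session.run(cypher, url="http://example.com/api/data")

driver.close()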

How do you update a SQLAlchemy RowProxy?

I'm working with the SQLAlchemy Expression Language (not the ORM), and I'm trying to figure out how to update a query result.
I've discovered that RowProxy objects don't support assignment, throwing an AttributeError instead:
# Get a row from the table
row = engine.execute(mytable.select().limit(1)).fetchone()
# Check that `foo` exists on the row
assert row.foo is None
# Try to update `foo`
row.foo = "bar"
AttributeError: 'RowProxy' object has no attribute 'foo'
I've found this solution, which makes use of the ORM, but I'm specifically looking to use the Expression Language.
I've also found this solution, which converts the row to a dict and updates the dict, but that seems like a hacky workaround.
So I have a few questions:
Is this in fact the only way to do it?
Moreover, is this the recommended way to do it?
And lastly, the lack of documentation made me wonder: am I just misusing SQLAlchemy by trying to do this?
You are misusing SQLAlchemy. The usage you've described is exactly the benefit an ORM provides. If you want to restrict yourself to SQLAlchemy Core, then you need to issue an UPDATE statement instead:
engine.execute(mytable.update().where(mytable.c.id == <id>).values(foo="bar"))
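
For completeness, here is a self-contained sketch of that Core-style update; the table, engine, and values are illustrative, and it assumes the 1.x engine.execute API used in the question:

from sqlalchemy import (Column, Integer, MetaData, String, Table,
                        create_engine)

# A throwaway in-memory table, purely for illustration.
engine = create_engine("sqlite://")
metadata = MetaData()
mytable = Table("mytable", metadata,
                Column("id", Integer, primary_key=True),
                Column("foo", String))
metadata.create_all(engine)
engine.execute(mytable.insert().values(id=1, foo=None))

# Core has no mutable rows: instead of assigning to a RowProxy,
# you emit an UPDATE statement against the table.
engine.execute(mytable.update()
               .where(mytable.c.id == 1)
               .values(foo="bar"))

row = engine.execute(mytable.select().limit(1)).fetchone()
assert row.foo == "bar"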

SAS pass through - Extract from MySQL does not work

I'm trying to build a Data Integration job that uses pass-through to extract data from a view in a MySQL database.
We've been using pass-through a lot in the project, mostly extracting data from Redshift;
however, with MySQL I was not able to make it work properly.
It keeps complaining that a table is missing, even though when pass-through is off, the view is found and the data is extracted...
I tried every trick I know, from enabling case-sensitive DBMS object names to manually removing single/double quotes from the statement, just in case MySQL confuses them with something else...
No luck.
The ODBC driver is [MySQL][ODBC 5.3(a) Driver][mysqld-5.5.53], running on a Windows environment.
Any idea how to solve this?
Thank you in advance.
EDIT
So, first of all, one correction (even though it's not that important): I extract from a view, not a table.
This is the code generated by the SAS Create Table transformation with pass-through enabled. I only put an asterisk instead of the full list of columns:
proc sql;
connect to ODBC
(
READBUFF=10000 DATASRC="cmp.web_api" AUTHDOMAIN="MYSQL_CMP_Auth"
);
create table work."W7ZZZKOC"n as
select
*
from connection to ODBC
(
select
V_BI_ACCOUNT.ACCOUNT_NAME,
V_BI_ACCOUNT.ACQUISITION_SOURCE__C,
V_BI_ACCOUNT.ZUORA__ACTIVE__C,
V_BI_ACCOUNT.ADDRESS_LINE_1__C,
V_BI_ACCOUNT.ADDRESS_LINE_2__C,
V_BI_ACCOUNT.ADDRESS_LINE_3__C,
V_BI_ACCOUNT.AGREEMENT_DATE,
V_BI_ACCOUNT.AGREEMENT_LEGAL_CLAUSE_1__C,
V_BI_ACCOUNT.AGREEMENT_LEGAL_CLAUSE_2__C,
V_BI_ACCOUNT.PERSONBIRTHDATE,
V_BI_ACCOUNT.BLOCKED_REASON__C,
V_BI_ACCOUNT.BRAND__C,
V_BI_ACCOUNT.CPN__C,
V_BI_ACCOUNT.ACCCREATEDBYID,
V_BI_ACCOUNT.ACCCREATEDDATE,
V_BI_ACCOUNT.CURRENCY_PREFERENCE__C,
V_BI_ACCOUNT.CUSTOMER_FULL_NAME__PC,
V_BI_ACCOUNT.ACCOUNTID,
V_BI_ACCOUNT.ZUORA__CUSTOMERPRIORITY__C,
V_BI_ACCOUNT.DELIVERY_SALUTATION__C,
V_BI_ACCOUNT.DISPLAY_NAME,
V_BI_ACCOUNT.PERSONEMAIL,
V_BI_ACCOUNT.EMAILKEY__C,
V_BI_ACCOUNT.FACEBOOKKEY,
V_BI_ACCOUNT.FIRSTNAME,
V_BI_ACCOUNT.GENDER__C,
V_BI_ACCOUNT.PHONE,
V_BI_ACCOUNT.ACCLASTACTIVITYDATE,
V_BI_ACCOUNT.ACCLASTMODIFIEDDATE,
V_BI_ACCOUNT.LASTNAME,
V_BI_ACCOUNT.OTHER_EMAIL__C,
V_BI_ACCOUNT.PI_TYPE__C,
V_BI_ACCOUNT.ACCPARENTID,
V_BI_ACCOUNT.POSTCODE__C,
V_BI_ACCOUNT.PRIMARY_ACCOUNT_OF_THIS_CUSTOMER,
V_BI_ACCOUNT.ACCPRIMARY__C,
V_BI_ACCOUNT.ACCREASON_FOR_STATUS__C,
V_BI_ACCOUNT.ZUORA__SLA__C,
V_BI_ACCOUNT.ZUORA__SLASERIALNUMBER__C,
V_BI_ACCOUNT.SALUTATION,
V_BI_ACCOUNT.ACCSYSTEMMODSTAMP,
V_BI_ACCOUNT.PERSONTITLE,
V_BI_ACCOUNT.ZUORA__UPSELLOPPORTUNITY__C,
V_BI_ACCOUNT.X_CODE__C,
V_BI_ACCOUNT.ZUORA__ACCOUNT_ID__C,
V_BI_ACCOUNT.ZUORA__PAYMENTMETHODID__C,
V_BI_ACCOUNT.CITY,
V_BI_ACCOUNT.ORIGINAL_CREATED_DATE,
V_BI_ACCOUNT.SOURCE_SYSTEM_ID,
V_BI_ACCOUNT.STATUS,
V_BI_ACCOUNT.ZUORA__CONTACT_ID,
V_BI_ACCOUNT.ACCISDELETED,
V_BI_ACCOUNT.BILLING_ACCOUNT_NAME,
V_BI_ACCOUNT.ACZCREATEDDATE,
V_BI_ACCOUNT.ACZSYSTEMMODSTAMP,
V_BI_ACCOUNT.ACZLASTACTIVITYDATE,
V_BI_ACCOUNT.ZUORA__ACCOUNT__C,
V_BI_ACCOUNT.ZUORA__ACCOUNTNUMBER__C,
V_BI_ACCOUNT.ZUORA__AUTOPAY__C,
V_BI_ACCOUNT.ZUORA__BALANCE__C,
V_BI_ACCOUNT.ZUORA__CREDITCARDEXPIRATION__C,
V_BI_ACCOUNT.ZUORA__CURRENCY__C,
V_BI_ACCOUNT.ZUORA__MRR__C,
V_BI_ACCOUNT.ZUORA__PAYMENTTERM__C,
V_BI_ACCOUNT.ZUORA__PURCHASEORDERNUMBER__C,
V_BI_ACCOUNT.ZUORA__LASTINVOICEDATE__C,
V_BI_ACCOUNT.COUNTRY_NAME,
V_BI_ACCOUNT.COUNTRY_CODE,
V_BI_ACCOUNT.FAVOURITE_FOOTBALL_CLUB,
V_BI_ACCOUNT.COUNTY
from
web_api.V_BI_ACCOUNT as V_BI_ACCOUNT
);
%rcSet(&sqlrc);
disconnect from ODBC;
quit;
And again, when I extract the data without pass-through, it works successfully.
I found out the problem was a column name exceeding 32 characters.
As SAS supports column names of up to 32 characters,
the query fails to find PRIMARY_ACCOUNT_OF_THIS_CUSTOMER, as the original column name is PRIMARY_ACCOUNT_OF_THIS_CUSTOMER__C.
EDIT
One more thing I found out: MySQL doesn't like the schema name being specified, nor aliases.
Therefore:
- the FROM clause should specify only the table name, i.e. 'from v_bi_account' rather than 'from web_api.v_bi_account';
- don't use aliases, i.e. use 'from v_bi_account' rather than 'from v_bi_account as v_bi_account'.
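For anyone debugging something similar, one hedged way to check both findings outside of SAS is to run the inner query directly against the same ODBC driver, e.g. with pyodbc; the DSN, credentials, and LIMIT below are illustrative:

import pyodbc  # assumes the MySQL ODBC 5.3 driver and a configured DSN

conn = pyodbc.connect("DSN=cmp.web_api;UID=someuser;PWD=secret")
cursor = conn.cursor()

# Unqualified view name, no alias, and the full >32-character column name:
# if this succeeds here but fails through pass-through, the SAS 32-character
# name limit (or the schema/alias quirk) is the likely culprit.
cursor.execute(
    "SELECT PRIMARY_ACCOUNT_OF_THIS_CUSTOMER__C "
    "FROM V_BI_ACCOUNT LIMIT 5")
for row in cursor.fetchall():
    print(row)

conn.close()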
Thank you guys so much for your help.

IllegalStateException while trying create NativeQuery with EntityManager

I have been getting this annoying exception while trying to create a native query with my entity manager. The full error message is:
java.lang.IllegalStateException: During synchronization a new object was found through a relationship that was not marked cascade PERSIST: com.model.OneToManyEntity2#61f3b3b.
at org.eclipse.persistence.internal.sessions.RepeatableWriteUnitOfWork.discoverUnregisteredNewObjects(RepeatableWriteUnitOfWork.java:313)
at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.calculateChanges(UnitOfWorkImpl.java:723)
at org.eclipse.persistence.internal.sessions.RepeatableWriteUnitOfWork.writeChanges(RepeatableWriteUnitOfWork.java:441)
at org.eclipse.persistence.internal.jpa.EntityManagerImpl.flush(EntityManagerImpl.java:874)
at org.eclipse.persistence.internal.jpa.QueryImpl.performPreQueryFlush(QueryImpl.java:967)
at org.eclipse.persistence.internal.jpa.QueryImpl.executeReadQuery(QueryImpl.java:207)
at org.eclipse.persistence.internal.jpa.QueryImpl.getSingleResult(QueryImpl.java:521)
at org.eclipse.persistence.internal.jpa.EJBQueryImpl.getSingleResult(EJBQueryImpl.java:400)
The actual code that triggers the error is:
Query query = entityManager.createNativeQuery(
        "SELECT MAX(CAST(SUBSTRING_INDEX(RecordID,'-',-1) as Decimal)) FROM `QueriedEntityTable`");
String recordID = (query.getSingleResult() == null
        ? null
        : query.getSingleResult().toString());
This is being executed within an EntityTransaction, in the doTransaction part. What's getting me, though, is that this is the first code to be executed within the doTransaction method, simplified below:
updateOneToManyEntity1();
updateOneToManyEntity2();
entityManager.merge(parentEntity);
The entity it has a problem with, "OneToManyEntity1", isn't even the table I'm trying to create the query on. I'm not doing any persist or merge up until this point either, so I'm also not sure what is supposedly causing it to be out of sync. The only database work being done up until this code executes is pulling in data, not changing anything. The foreign keys are properly set up in the database.
I'm able to get rid of this error by doing as it says and marking these relationships as Cascade.PERSIST, but then I get a MySQLIntegrityConstraintViolationException on the query.getSingleResult() line. My logs show that it's doing some INSERT queries right before this, so it looks like it's reaching the EntityManager.merge part of my doTransaction method, but the error and call stack point to a completely different part of the code.
I'm using EclipseLink (2.6.1), Glassfish 4, and MySQL. The entity manager uses RESOURCE_LOCAL with all the necessary classes listed under the persistence-unit tag, and exclude-unlisted-classes is set to false.
Edit: Some more info as I'm trying to work through this. If I put a breakpoint at the beginning of the transaction and then execute entityManager.clear() through IntelliJ's "Evaluate Expression" tool, everything works fine, at least the first time through. Without it, I get an error as it tries to insert empty objects into the table.
Edit #2: I converted the native query to use the Criteria API, which let me make it through my code and find where I was unintentionally adding a null object to my entity list. I'm still confused as to why the entity manager holds on to these bad changes to the point that creating a native query breaks, because it's still trying to insert the bad data. Is this something I'd need to call EntityManager.clear() for before each query? Or am I supposed to call it when there is an error in the doTransaction method?
So after reworking the code and setting this aside, I stumbled on at least part of the answer to my question. My issue was caused by the object being persisted prior to the transaction starting. So when I was entering my transaction, it first tried to insert/update data from my entity objects and threw an error since I hadn't set the values of most of the non-null columns. I believe this is the reason I was getting the cascade errors and I'm positive this is the source of the random insert queries I saw being fired off at the beginning of my transaction. Hope this helps someone else avoid a lot of trouble.

Multiple, unknown number of fields passed into a query

Is it possible to create a generic query that would work for different types of documents? For example, I have "cases" and "factories";
they have different sets of fields, e.g.:
{
  id: 'case_o1',
  name: 'Case numero uno',
  amount: 40
}
{
  id: 'factory_002',
  location: 'Venezuela',
  workers: 200,
  operating: true
}
Is it possible to create a generic query where I would pass the type of an entity (case or factory) and additional parameters and it would filter results based on those?
I could of course use a JavaScript view, but it doesn't allow me to filter by multiple fields. Let's say I want to fetch all factories located in Venezuela with a number of workers between 20 and 55.
I started with this, but then I got stuck:
select * from `mybucket` as entity
where position(meta(entity).id, $entity_type) == 0
How do I pass multiple predicates and have the query to recognize them?
I can of course list fields like this:
where position(meta(entity).id, $entity_type) == 0
and entity.location == 'Venezuela'
and entity.workers > $workers_min
and entity.workers < $workers_max
but then:
- I'm going to have to create a separate query for each entity type;
- and even then it won't solve my problem: I have no idea how to ignore predicates. What if next time $workers_min and $workers_max are not passed? Does that mean I have to create a query for every single combination of predicates (columns)?
For security reasons I cannot generate free-form queries and pass them to the Couchbase server; all the queries are already stored in the database, and our API just picks them up out of a document and executes them.
I think it's possible to create a query that would "short-circuit" for args that are undefined (e.g. WHERE $location IS MISSING OR entity.location == $location, or something like that).
Is it possible at all to create a query that can effectively filter and order a dataset based on arbitrary parameters? Or is there no way?
@Agzam: Sorry, I was writing my comment when you posted this. Anyway, what you are asking for is possible by using coalesces in not-too-complex expressions, but it is a REALLY bad idea, because it will defeat most internal database optimizations, including the use of any existing index. So, unless you are dealing with a relatively small database (and you are sure it will remain approximately the same size), I suggest you try a different approach… This is, in fact, the reason I implemented sqlapi.
If you need to have all queries previously stored in the database, it would probably be much better to sort the given arguments by name, then precalculate and store a query for each possible combination.
You can do it by assigning a default value to the variable when it is not used. For instance, if $location is not used, you can set it to -1 as the default value.
Then the where condition would be:
WHERE ($location=-1 OR entity.location = $location)
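
Putting the pieces together, here is a hedged sketch of one such stored query with several optional parameters, executed through the Python SDK; the bucket name, sentinel values, and field names are illustrative:

from couchbase.bucket import Bucket
from couchbase.n1ql import N1QLQuery

bucket = Bucket("couchbase://localhost/mybucket")

# Each predicate collapses to TRUE when its parameter carries the agreed
# "not supplied" sentinel (-1 or '' here), so one stored query serves any
# combination of filters, at the cost of index usage as noted above.
query = N1QLQuery(
    "SELECT entity.* FROM `mybucket` AS entity "
    "WHERE POSITION(META(entity).id, $type) == 0 "
    "AND ($location = '' OR entity.location = $location) "
    "AND ($workers_min = -1 OR entity.workers > $workers_min) "
    "AND ($workers_max = -1 OR entity.workers < $workers_max)",
    type="factory", location="Venezuela",
    workers_min=20, workers_max=55)

for row in bucket.n1ql_query(query):
    print(row)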

Delete entry in couchbase bucket using key in the form of regex

I have a requirement wherein I have to delete an entry from a Couchbase bucket. I use the delete method of the CouchbaseClient from my Java application, to which I pass the key. But in one particular case I don't have the entire key name, only a part of it. So I thought there would be a method that takes a matcher, but I could not find one. The following is the actual key that is stored in the bucket:
123_xyz_havefun
and the part of the key that I have is xyz. I am not sure whether this can be done. Can anyone help?
Couchbase's DELETE operation supports neither wildcards nor regular expressions, so you have to get the list of keys somehow and pass each one to the function. For example, you might use Couchbase views, or maintain your own list of keys via the APPEND command: create the key xyz and append all the matching keys to its value during the application's lifetime, flushing this key after the real delete request.
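A rough sketch of that APPEND-based bookkeeping with the Python SDK follows; the bucket, key names, and comma delimiter are all illustrative assumptions:

from couchbase import FMT_UTF8
from couchbase.bucket import Bucket
from couchbase.exceptions import CouchbaseError

bucket = Bucket("couchbase://localhost/mybucket")

def track_key(component, full_key):
    # Record full_key under its shared component, e.g. "xyz". Plain-string
    # format, since JSON-formatted values cannot be appended to.
    try:
        bucket.append(component, full_key + ",", format=FMT_UTF8)
    except CouchbaseError:
        # First key for this component: nothing exists to append to yet.
        bucket.upsert(component, full_key + ",", format=FMT_UTF8)

def delete_by_component(component):
    # Delete every key recorded for the component, then the list itself.
    keys = bucket.get(component).value.rstrip(",").split(",")
    for key in keys:
        bucket.remove(key)
    bucket.remove(component)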
Well, I think you can achieve a delete using a wildcard or regex-like expression.
The answers above basically say:
- query the data from Couchbase,
- iterate over the result set,
- and fire a delete for each key of interest.
However, I believe a delete on the server should be a delete on the server, rather than requiring the three steps above.
In this regard, I think old-fashioned RDBMSs were better: all you need to do is fire a SQL query like 'DELETE FROM sometable WHERE somecolumn LIKE "match%"'.
Fortunately, something similar to SQL is available in Couchbase, called N1QL (pronounced 'nickel'). I am not aware of the JavaScript (and other language) syntax, but this is how I did it in Python.
The query to be used: DELETE FROM `test-feature` b WHERE META(b).id LIKE "prefix%"
from couchbase.n1ql import N1QLQuery
from couchbase.exceptions import CouchbaseError

# cb is an already-opened Bucket; cb_layer_key is the key prefix to match.
layer_name_prefix = cb_layer_key + "|" + "%"
try:
    query = N1QLQuery('DELETE FROM `test-feature` b WHERE META(b).id LIKE $1',
                      layer_name_prefix)
    cb.n1ql_query(query).execute()
except CouchbaseError as e:
    logger.exception(e)
To achieve the same thing, an alternative query could be as below, if you are storing 'type' and/or other metadata like 'parent_id':
DELETE FROM `test-feature` WHERE type='Feature' AND parent_id=8;
But I prefer the first version of the query, as it operates on the key, and I believe Couchbase has internal indexes that make operating/querying on keys (and other metadata) faster.
Although it is true you cannot iterate over documents with a regex, you could create a new view and have your map function only emit keys that match your regex.
An (obviously contrived and awful regex) example map function could be:
function(doc, meta) {
  if (meta.id.match(/_xyz_/)) {
    emit(meta.id, null);
  }
}
An alternative idea would be to extract that portion of the key from each document and then emit that. That would allow you to use the same index to match different documents by that particular key form.
function(doc, meta) {
  var match = meta.id.match(/^.*_(...)_.*$/);
  if (match) {
    emit(match[1], null);
  }
}
In your case, this would emit the key xyz (or the corresponding component from each key) for each document. You could then just use startkey and endkey to limit based on your criteria.
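Here is a hedged sketch of that lookup plus the actual deletes with the Python SDK; the design document and view names are assumptions, and the exact query-parameter spelling may vary by SDK version:

from couchbase.bucket import Bucket
from couchbase.views.params import Query

bucket = Bucket("couchbase://localhost/mybucket")

# Query the second map function above (published here as view
# "by_key_component" in design document "keys") for the exact component,
# then delete each matching document by its full id.
query = Query(mapkey_range=["xyz", "xyz"], inclusive_end=True)
for row in bucket.query("keys", "by_key_component", query=query):
    bucket.remove(row.docid)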
Lastly, there are a ton of options from the information retrieval research space for building text indexes that could apply here. I'll refer you to this doc on permuterm indexes to get you started.