I am having an issue with comparing arrays in QueryDSL. The issue is that comparison on byte arrays is missing in QueryDSL, because ArrayPath extends SimpleExpression, which only has methods like eq, ne, in, isNull...
Using Kotlin, Hibernate 5.2, QueryDSL 4.1.4, MySQL 5.7
I have a Hibernate embeddable object IpAddress:
@Embeddable
class IpAddress(
var ipAddressBinary: ByteArray
)
which is mapped in the DB to the type VARBINARY(4) (for IPv4 addresses; IPv6 support is planned for the future). I store the IP address there in binary form, which I need for searching in ranges.
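For illustration only (this is an assumption about what the binary form looks like; my own MapperUtils.IPv4.normalizedToBytes helper used below does something equivalent), the 4-byte form of an IPv4 address can be obtained in Kotlin like this:
import java.net.InetAddress

// Hypothetical helper for illustration: "10.20.0.1" -> byteArrayOf(10, 20, 0, 1)
fun ipv4ToBytes(ip: String): ByteArray = InetAddress.getByName(ip).address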
I wanted to create Kotlin extension functions to support goe and loe on my embedded object.
So far I came up with this:
fun ArrayPath<ByteArray, Byte>.goe(ipBytes: ByteArray): BooleanExpression {
return Expressions.booleanOperation(Ops.GOE, ConstantImpl.create(ipBytes))
}
fun QIpAddress.goe(ipAddressStr: String): BooleanExpression {
val ipBytes = MapperUtils.IPv4.normalizedToBytes(ipAddressStr)
return this.ipAddressBinary.goe(ipBytes)
}
but calling
query.where(QBlackListRecord.blackListRecord.startAddress.goe(someByteArrayRepresentingIpAddress))
results in an exception:
[exec-10]java.lang.IndexOutOfBoundsException: index (1) must be less than size (1)
at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:310)
at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:292)
at com.google.common.collect.SingletonImmutableList.get(SingletonImmutableList.java:45)
at com.querydsl.core.types.Template$ByIndex.convert(Template.java:190)
at com.querydsl.core.support.SerializerBase.visitOperation(SerializerBase.java:256)
at com.querydsl.jpa.JPQLSerializer.visitOperation(JPQLSerializer.java:426)
at com.querydsl.core.support.SerializerBase.visit(SerializerBase.java:231)
at com.querydsl.core.support.SerializerBase.visit(SerializerBase.java:31)
at com.querydsl.core.types.OperationImpl.accept(OperationImpl.java:83)
at com.querydsl.core.support.SerializerBase.handle(SerializerBase.java:92)
at com.querydsl.core.support.SerializerBase.visitOperation(SerializerBase.java:267)
at com.querydsl.jpa.JPQLSerializer.visitOperation(JPQLSerializer.java:437)
at com.querydsl.core.support.SerializerBase.visit(SerializerBase.java:231)
at com.querydsl.core.support.SerializerBase.visit(SerializerBase.java:31)
at com.querydsl.core.types.OperationImpl.accept(OperationImpl.java:83)
at com.querydsl.core.support.SerializerBase.handle(SerializerBase.java:92)
at com.querydsl.jpa.JPQLSerializer.serialize(JPQLSerializer.java:220)
at com.querydsl.jpa.JPAQueryBase.serialize(JPAQueryBase.java:60)
at com.querydsl.jpa.JPAQueryBase.serialize(JPAQueryBase.java:50)
at com.querydsl.jpa.impl.AbstractJPAQuery.createQuery(AbstractJPAQuery.java:98)
at com.querydsl.jpa.impl.AbstractJPAQuery.createQuery(AbstractJPAQuery.java:94)
at com.querydsl.jpa.impl.AbstractJPAQuery.fetch(AbstractJPAQuery.java:201)
at org.springframework.data.jpa.repository.support.QuerydslJpaRepository.findAll(QuerydslJpaRepository.java:164)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.data.repository.core.support.RepositoryComposition$RepositoryFragments.invoke(RepositoryComposition.java:377)
at org.springframework.data.repository.core.support.RepositoryComposition.invoke(RepositoryComposition.java:200)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$ImplementationMethodExecutionInterceptor.invoke(RepositoryFactorySupport.java:629)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.doInvoke(RepositoryFactorySupport.java:593)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.invoke(RepositoryFactorySupport.java:578)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
at org.springframework.data.projection.DefaultMethodInvokingMethodInterceptor.invoke(DefaultMethodInvokingMethodInterceptor.java:59)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:294)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:98)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:139)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
at org.springframework.data.jpa.repository.support.CrudMethodMetadataPostProcessor$CrudMethodMetadataPopulatingMethodInterceptor.invoke(CrudMethodMetadataPostProcessor.java:135)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
at org.springframework.data.repository.core.support.SurroundingTransactionDetectorMethodInterceptor.invoke(SurroundingTransactionDetectorMethodInterceptor.java:61)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
at org.springframework.data.repository.core.support.MethodInvocationValidator.invoke(MethodInvocationValidator.java:99)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212)
at com.sun.proxy.$Proxy144.findAll(Unknown Source)
I am not sure whether my approach to the matter is correct. Is there any other way to compare byte arrays in QueryDSL which produces simple SQL with simple comparison operators?
All I need is to produce SQL like this:
select *
from black_list_record
where start_ip_address_binary >= #binary_value
or like this
select *
from black_list_record
where start_ip_address_binary >= inet6_aton('10.20.0.1')
The error was caused by this function:
fun ArrayPath<ByteArray, Byte>.goe(ipBytes: ByteArray): BooleanExpression {
return Expressions.booleanOperation(Ops.GOE, ConstantImpl.create(ipBytes))
}
which should look like this
fun ArrayPath<ByteArray, Byte>.goe(ipBytes: ByteArray): BooleanExpression {
return Expressions.booleanOperation(Ops.GOE, this, ConstantImpl.create(ipBytes))
}
I did not pass the "this" argument to the GOE operation, which expects two arguments because its template is {0} >= {1}. That is all...
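For completeness, the loe counterpart mentioned at the beginning can be sketched the same way (untested, but it follows the exact same pattern):
// same imports as the goe extension above
fun ArrayPath<ByteArray, Byte>.loe(ipBytes: ByteArray): BooleanExpression {
    // Same fix applies: pass "this" as the first argument of the LOE template {0} <= {1}
    return Expressions.booleanOperation(Ops.LOE, this, ConstantImpl.create(ipBytes))
}
fun QIpAddress.loe(ipAddressStr: String): BooleanExpression {
    val ipBytes = MapperUtils.IPv4.normalizedToBytes(ipAddressStr)
    return this.ipAddressBinary.loe(ipBytes)
}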
If you don't use Kotlin but Java, you can use QueryDSL's built-in method delegation - http://www.querydsl.com/static/querydsl/4.2.1/reference/html_single/#d0e2474
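Roughly, such a delegate method could look like the sketch below (Java, adapted from the pattern in the linked reference and reusing the MapperUtils helper from above; the class name is hypothetical). After the next annotation-processing run, QIpAddress should pick up a goe(String) method:
import com.querydsl.core.annotations.QueryDelegate;
import com.querydsl.core.types.ConstantImpl;
import com.querydsl.core.types.Ops;
import com.querydsl.core.types.dsl.BooleanExpression;
import com.querydsl.core.types.dsl.Expressions;

public final class IpAddressPredicates {

    // Registered on IpAddress; QueryDSL regenerates QIpAddress with this method
    @QueryDelegate(IpAddress.class)
    public static BooleanExpression goe(QIpAddress ipAddress, String ipAddressStr) {
        // Builds the same {0} >= {1} operation as the Kotlin extension above
        byte[] ipBytes = MapperUtils.IPv4.normalizedToBytes(ipAddressStr);
        return Expressions.booleanOperation(Ops.GOE, ipAddress.ipAddressBinary, ConstantImpl.create(ipBytes));
    }
}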
Related
I have a Spark job that reads from an AWS Aurora MySQL database. Unfortunately, the job keeps failing with an exception because of an invalid datetime in one of the records.
Sample code:
val jdbcUrl =
s"jdbc:mysql://$dbHostname:$dbPort/$dbName?zeroDateTimeBehavior=convertToNull&serverTimezone=UTC"
val props = reportConf.connectionProps(db)
val df = spark.read
.format("jdbc")
.option("url", jdbcUrl)
.option("dbtable",
s"(SELECT *, MOD($partitionColumn,10) AS partition_key FROM $table ORDER BY $partitionColumn) as $table") // DESC LIMIT 50000
.option("user", user)
.option("password", password)
.option("driver", driver)
.option("numPartitions", numPartitions)
.option("partitionColumn", "partition_key")
.option("lowerBound", 0)
.option("upperBound", 9)
.option("mode", "DROPMALFORMED")
.load()
.drop('partition_key)
I already tried converting the zero date values to null with zeroDateTimeBehavior=convertToNull in my jdbcUrl, but it is not working.
Preferably, I would like to skip the bad records or replace them with some default values that I can filter on later, instead of manually identifying the bad records in the database tables.
Any idea how to fix this in Spark?
Exception:
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:678)
Caused by: java.lang.IllegalArgumentException: DAY_OF_MONTH
at java.util.GregorianCalendar.computeTime(GregorianCalendar.java:2648)
at java.util.Calendar.updateTime(Calendar.java:3393)
at java.util.Calendar.getTimeInMillis(Calendar.java:1782)
at com.mysql.cj.jdbc.io.JdbcDateValueFactory.createFromDate(JdbcDateValueFactory.java:67)
at com.mysql.cj.jdbc.io.JdbcDateValueFactory.createFromDate(JdbcDateValueFactory.java:39)
at com.mysql.cj.core.io.ZeroDateTimeToNullValueFactory.createFromDate(ZeroDateTimeToNullValueFactory.java:41)
at com.mysql.cj.core.io.BaseDecoratingValueFactory.createFromDate(BaseDecoratingValueFactory.java:46)
at com.mysql.cj.core.io.BaseDecoratingValueFactory.createFromDate(BaseDecoratingValueFactory.java:46)
at com.mysql.cj.core.io.MysqlTextValueDecoder.decodeDate(MysqlTextValueDecoder.java:66)
at com.mysql.cj.mysqla.result.AbstractResultsetRow.decodeAndCreateReturnValue(AbstractResultsetRow.java:70)
at com.mysql.cj.mysqla.result.AbstractResultsetRow.getValueFromBytes(AbstractResultsetRow.java:225)
at com.mysql.cj.mysqla.result.TextBufferRow.getValue(TextBufferRow.java:122)
at com.mysql.cj.jdbc.result.ResultSetImpl.getNonStringValueFromRow(ResultSetImpl.java:630)
at com.mysql.cj.jdbc.result.ResultSetImpl.getDateOrTimestampValueFromRow(ResultSetImpl.java:643)
at com.mysql.cj.jdbc.result.ResultSetImpl.getDate(ResultSetImpl.java:788)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$2.apply(JdbcUtils.scala:389)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$2.apply(JdbcUtils.scala:387)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:356)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:338)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at org.apache.spark.util.CompletionIterator.hasNex
I am getting an error while using a DataWeave transformation on a JSON payload. The JSON payload is:
{
"requestId": "13431#1638a2abfb8",
"result": [
{
"batchId": 1028,
"importId": "1028",
"status": "Queued"
}
],
"success": true
}
The above payload is returned by a RESTful service, and I have converted it to an object using a Byte Array to Object transformer before applying the following DataWeave transformation:
%dw 1.0
%output application/json
---
batchexecution:
{
batchid:payload.result[0].batchid,
status: payload.result[0].status,
success:payload.success
} when ((payload.result != null) and (sizeOf payload.result > 0))
otherwise
{
batchid: 0,
status:"Not queued",
success:false
}
I am expecting only one record in the result array, and I have a check for whether the array is null or its size is > 0. I get the following error when I execute the transformation code and am not sure what is wrong here.
This is the output I am expecting from the transformation:
{
"batchexecution": {
"batchId": 1028,
"status": "Queued",
"success": true
}
}
But instead I am getting the following error: You cannot compare a value of type ::array.
Message : Exception while executing:
{"requestId":"64b3#1638e55058c","result":[{"batchId":1037,"importId":"1037","status":"Queued"}],"success":true}
^
You cannot compare a value of type ::array.
Payload : {"requestId":"64b3#1638e55058c","result":[{"batchId":1037,"importId":"1037","status":"Queued"}],"success":true}
Payload Type : java.lang.String
Element : /marketing-dbmkt-etl-marketoFlow/processors/8 # marketing-dbmkt-etl-marketo:marketing-dbmkt-etl-marketo.xml:69 (Transform Message)
Element XML : <dw:transform-message doc:name="Transform Message" metadata:id="90448cfd-5884-441a-a989-e32e4877ac24">
<dw:input-payload mimeType="application/json" doc:sample="sample_data\batchreturnObject.dwl"></dw:input-payload>
<dw:set-payload>%dw 1.0%output application/json---batchexecution:{batchid:payload.result[0].batchid,status: payload.result[0].status,success:payload.success} when ((payload.result != null) and (sizeOf payload.result > 0))otherwise{batchid: 0,status:"Not queued",success:false}</dw:set-payload>
</dw:transform-message>
--------------------------------------------------------------------------------
Root Exception stack trace:
com.mulesoft.weave.mule.exception.WeaveExecutionException: Exception while executing:
{"requestId":"64b3#1638e55058c","result":[{"batchId":1037,"importId":"1037","status":"Queued"}],"success":true}
^
You cannot compare a value of type ::array.
at com.mulesoft.weave.mule.exception.WeaveExecutionException$.apply(WeaveExecutionException.scala:10)
at com.mulesoft.weave.mule.WeaveMessageProcessor.execute(WeaveMessageProcessor.scala:121)
at com.mulesoft.weave.mule.WeaveMessageProcessor.process(WeaveMessageProcessor.scala:67)
at org.mule.execution.ExceptionToMessagingExceptionExecutionInterceptor.execute(ExceptionToMessagingExceptionExecutionInterceptor.java:27)
at org.mule.execution.MessageProcessorNotificationExecutionInterceptor.execute(MessageProcessorNotificationExecutionInterceptor.java:108)
at org.mule.execution.MessageProcessorExecutionTemplate.execute(MessageProcessorExecutionTemplate.java:44)
at org.mule.processor.BlockingProcessorExecutor.executeNext(BlockingProcessorExecutor.java:88)
at org.mule.processor.BlockingProcessorExecutor.execute(BlockingProcessorExecutor.java:59)
at org.mule.execution.ExceptionToMessagingExceptionExecutionInterceptor.execute(ExceptionToMessagingExceptionExecutionInterceptor.java:27)
at org.mule.execution.MessageProcessorExecutionTemplate.execute(MessageProcessorExecutionTemplate.java:44)
at org.mule.processor.BlockingProcessorExecutor.executeNext(BlockingProcessorExecutor.java:98)
at org.mule.processor.BlockingProcessorExecutor.execute(BlockingProcessorExecutor.java:59)
at org.mule.interceptor.AbstractEnvelopeInterceptor.processBlocking(AbstractEnvelopeInterceptor.java:58)
at org.mule.processor.AbstractRequestResponseMessageProcessor.process(AbstractRequestResponseMessageProcessor.java:47)
at org.mule.processor.AsyncInterceptingMessageProcessor.processNextTimed(AsyncInterceptingMessageProcessor.java:129)
at org.mule.processor.AsyncInterceptingMessageProcessor$AsyncMessageProcessorWorker$1.process(AsyncInterceptingMessageProcessor.java:213)
at org.mule.processor.AsyncInterceptingMessageProcessor$AsyncMessageProcessorWorker$1.process(AsyncInterceptingMessageProcessor.java:206)
at org.mule.execution.ExecuteCallbackInterceptor.execute(ExecuteCallbackInterceptor.java:16)
at org.mule.execution.CommitTransactionInterceptor.execute(CommitTransactionInterceptor.java:35)
at org.mule.execution.CommitTransactionInterceptor.execute(CommitTransactionInterceptor.java:22)
at org.mule.execution.HandleExceptionInterceptor.execute(HandleExceptionInterceptor.java:30)
at org.mule.execution.HandleExceptionInterceptor.execute(HandleExceptionInterceptor.java:14)
at org.mule.execution.BeginAndResolveTransactionInterceptor.execute(BeginAndResolveTransactionInterceptor.java:67)
at org.mule.execution.ResolvePreviousTransactionInterceptor.execute(ResolvePreviousTransactionInterceptor.java:44)
at org.mule.execution.SuspendXaTransactionInterceptor.execute(SuspendXaTransactionInterceptor.java:50)
at org.mule.execution.ValidateTransactionalStateInterceptor.execute(ValidateTransactionalStateInterceptor.java:40)
at org.mule.execution.IsolateCurrentTransactionInterceptor.execute(IsolateCurrentTransactionInterceptor.java:41)
at org.mule.execution.ExternalTransactionInterceptor.execute(ExternalTransactionInterceptor.java:48)
at org.mule.execution.RethrowExceptionInterceptor.execute(RethrowExceptionInterceptor.java:28)
at org.mule.execution.RethrowExceptionInterceptor.execute(RethrowExceptionInterceptor.java:13)
at org.mule.execution.TransactionalErrorHandlingExecutionTemplate.execute(TransactionalErrorHandlingExecutionTemplate.java:110)
at org.mule.execution.TransactionalErrorHandlingExecutionTemplate.execute(TransactionalErrorHandlingExecutionTemplate.java:30)
at org.mule.processor.AsyncInterceptingMessageProcessor$AsyncMessageProcessorWorker.doRun(AsyncInterceptingMessageProcessor.java:205)
at org.mule.work.AbstractMuleEventWork.run(AbstractMuleEventWork.java:53)
at org.mule.work.WorkerContext.run(WorkerContext.java:301)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
********************************************************************************
The problem here is not obvious, but I have come across the same issue before - it's related to the sizeOf function and the poor way that Mule applies precedence to some of its operators. When you write (sizeOf payload.result > 0), it first tries to resolve the payload.result > 0 expression - hence the error you're seeing (it's trying to compare an array to 0).
Please use ((sizeOf payload.result) > 0) instead (I always make a point of wrapping sizeOf in parentheses for this reason).
As a side note, you have batchid:payload.result[0].batchid - it should be batchId:payload.result[0].batchId (capitalisation in batchId)
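Putting both fixes together (the parenthesised sizeOf and the batchId capitalisation), the transform would look roughly like this:
%dw 1.0
%output application/json
---
batchexecution:
{
    batchId: payload.result[0].batchId,
    status: payload.result[0].status,
    success: payload.success
} when ((payload.result != null) and ((sizeOf payload.result) > 0))
otherwise
{
    batchId: 0,
    status: "Not queued",
    success: false
}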
Whenever you use a function like sizeOf in DataWeave, try to wrap it in parentheses to avoid these kinds of errors.
@ghoshyTech, in your case:
when ((payload.result != null) and ((sizeOf payload.result) > 0))
I have been stuck on this problem for a while now and couldn't find any solution so far.
Basically, I am doing nothing out of the ordinary.
I have a method that configures and sends a REST request using wslite, and this method accepts a map as the payload for the closure of the client's post method, like this:
def postSomething(Map payload){
RESTClient client = new RESTClient("http://somedomain.com/path/")
return client.post( accept: ContentType.ANY,
headers: [ callTreeId: uuid, jwt: token ] )
{
json payload
}
}
The passed map comes from another class which is responsible for some transformation and building the map containing the data I want to post.
The map is structured like this:
Map data =
[
country: "String1",
fulfillingStoreId: "String2",
customerId: "String3",
cardholderId: "String3",
deliveryDate: "String4",
deliveryAddressId: "String5",
serviceType: "String6",
paymentType: "String1",
source: "String7",
origin:
[
system: "String8",
id: "String9"
]
,
contactFirstName: "String10",
contactLastName: "String11",
contactPhoneNumber: "String12",
items: [m_itemList] //a list that holds instances of some item objects
]
def List <Item> m_itemList = []
class Item {
def property1 = ""
def property2 = ""
def property3 = ""
}
Using JsonOutput.prettyPrint(JsonOutput.toJson(data)) prints a nice JSON string representation to the console - everything looks as expected.
Now, passing the 'data' map to the post closure (the payload) raises a java.lang.StackOverflowError.
Stack trace:
Exception in thread "main" java.lang.StackOverflowError
at java.util.HashMap.getNode(Unknown Source)
at java.util.HashMap.get(Unknown Source)
at java.lang.ClassLoader.getPackage(Unknown Source)
at java.lang.Package.getPackage(Unknown Source)
at java.lang.Class.getPackage(Unknown Source)
at wslite.json.JSONObject.wrap(JSONObject.java:1595)
at wslite.json.JSONArray.<init>(JSONArray.java:173)
at wslite.json.JSONObject.wrap(JSONObject.java:1590)
at wslite.json.JSONObject.populateMap(JSONObject.java:1012)
at wslite.json.JSONObject.<init>(JSONObject.java:292)
at wslite.json.JSONObject.wrap(JSONObject.java:1606)
at wslite.json.JSONObject.populateMap(JSONObject.java:1012)
at wslite.json.JSONObject.<init>(JSONObject.java:292)
at wslite.json.JSONObject.wrap(JSONObject.java:1606)
at wslite.json.JSONArray.<init>(JSONArray.java:173)
at wslite.json.JSONObject.wrap(JSONObject.java:1590)
at wslite.json.JSONObject.populateMap(JSONObject.java:1012)
at wslite.json.JSONObject.<init>(JSONObject.java:292)
at wslite.json.JSONObject.wrap(JSONObject.java:1606)
at wslite.json.JSONObject.populateMap(JSONObject.java:1012)
at wslite.json.JSONObject.<init>(JSONObject.java:292)
at wslite.json.JSONObject.wrap(JSONObject.java:1606)
at wslite.json.JSONArray.<init>(JSONArray.java:173)
at wslite.json.JSONObject.wrap(JSONObject.java:1590)
at wslite.json.JSONObject.populateMap(JSONObject.java:1012)
at wslite.json.JSONObject.<init>(JSONObject.java:292)
at wslite.json.JSONObject.wrap(JSONObject.java:1606)
at wslite.json.JSONObject.populateMap(JSONObject.java:1012)
at wslite.json.JSONObject.<init>(JSONObject.java:292)
at wslite.json.JSONObject.wrap(JSONObject.java:1606)
at wslite.json.JSONArray.<init>(JSONArray.java:173)
at wslite.json.JSONObject.wrap(JSONObject.java:1590)
at wslite.json.JSONObject.populateMap(JSONObject.java:1012)
at wslite.json.JSONObject.<init>(JSONObject.java:292)
at wslite.json.JSONObject.wrap(JSONObject.java:1606)
at wslite.json.JSONObject.populateMap(JSONObject.java:1012)
at wslite.json.JSONObject.<init>(JSONObject.java:292)
at wslite.json.JSONObject.wrap(JSONObject.java:1606)
at wslite.json.JSONArray.<init>(JSONArray.java:173)
at wslite.json.JSONObject.wrap(JSONObject.java:1590)
at wslite.json.JSONObject.populateMap(JSONObject.java:1012)
...
...
...
I understand that the content builder of the wslite client accepts a map, and I have done this before with other (simpler) requests. So what might be the problem?
Thank you in advance for your contributions.
UPDATE / WORKAROUND SOLUTION:
So after some digging I figured I would just re-slurp the built JSON with the JsonSlurper before passing it to the content-building closure, since pretty-printing the map shows correct results. Voila! No more StackOverflowError.
I now construct the map using the JsonBuilder, parse the result (a String) with the JsonSlurper, and finally pass that to wslite's content builder.
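In code, the workaround looks roughly like this (a sketch; my actual builder calls may differ slightly):
import groovy.json.JsonOutput
import groovy.json.JsonSlurper

// Serialise the map (including the Item objects) to a JSON string, then parse it
// back into plain maps/lists before handing it to wslite's content builder.
def rebuilt = new JsonSlurper().parseText(JsonOutput.toJson(data))
postSomething(rebuilt as Map)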
I just had the same issue. Turns out, my payload Map had values in it that were not Strings (they were UUIDs and Enums, which typically auto-toString() nicely). When I manually converted the values to Strings, the error went away.
payload.each { k, v -> payload[k] = v.toString() }
I'm trying to build a power analysis tool in Scala. In particular, I'm trying to create a function that returns the required sample size given some restrictions (statistical power, effect size, the ratio between the two sample sizes, significance level, and the type of hypothesis test we would like to run).
I wrote the following code:
import org.apache.commons.math3.analysis.UnivariateFunction
import org.apache.commons.math3.analysis.solvers.BrentSolver
import org.apache.commons.math3.distribution.NormalDistribution
import scala.math._
object normal_sample_size extends App {
val theta1 = 0.0025
val theta2 = theta1 * 1.05
val split = 0.50
val power = 0.80
val significance = 0.05
val alternative = "two sided"
val result = requiredSizeCalculationNormal(effectSizeCalculation(theta1, theta2), split, power, significance, alternative)
println(result)
def effectSizeCalculation(mu1: Double, mu2: Double): Double = {
2 * math.asin(math.sqrt(mu1)) - 2 * math.asin(math.sqrt(mu2))
}
def requiredSizeCalculationNormal(effect_size: Double, split_ratio: Double, power: Double, significance: Double, alternative: String): Unit = {
if (alternative == "two sided") {
val z_score = new NormalDistribution().inverseCumulativeProbability(significance / 2.0)
val func = new UnivariateFunction {def value(n: Double) = (1 - new NormalDistribution().cumulativeProbability(z_score.abs - effect_size * sqrt(n * split_ratio * n * (1 - split_ratio)))) +
new NormalDistribution().cumulativeProbability(z_score - effect_size * sqrt(n * split_ratio * n * (1 - split_ratio)))}
val solver = new BrentSolver()
try {solver.solve(500, func, 0.001, 1000000)} catch {case _:Throwable => None}
} else if (alternative == "less") {
//val z_score = new NormalDistribution().inverseCumulativeProbability(significance)
//new NormalDistribution().cumulativeProbability(z_score - effect_size * sqrt(n * split_ratio * n * (1 - split_ratio)))
} else if (alternative == "greater") {
//val z_score = new NormalDistribution().inverseCumulativeProbability(significance).abs
//1 - new NormalDistribution().cumulativeProbability(z_score - effect_size * sqrt(n * split_ratio * n * (1 - split_ratio)))
} else {
println("Error message: Specify the right alternative. Possible values are 'two sided', 'less', 'greater'")
}
}
}
requiredSizeCalculationNormal() recognises what type of hypothesis test it is dealing with ("two sided", "greater" or "less") and then computes the required sample size. This last point is where I'm struggling! Using Brent's method (from the math3 library) I would like to find the value of n that solves the function.
Going through the code, focusing only on the "two sided" part of my function (I'll complete "greater" and "less" in a second stage), I've tried to create a function named func. Then, using the command try {solver.solve(500, func, 0.001, 1000000)} catch {case _:Throwable => None}, I'm trying to solve it with a maximum of 500 iterations within the interval (0.001, 1000000). Unfortunately, the command returns neither a result nor an error message.
EDIT: I've removed the catch {case _:Throwable => None} from the code quoted above (None was redundant anyway). That piece of code was catching all exceptions and doing nothing with them.
The exception I see when running the code is:
Exception in thread "main" org.apache.commons.math3.exception.NoBracketingException: function values at endpoints do not have different signs, endpoints: [2, 10,000,000], values: [0.05, 1]
at org.apache.commons.math3.analysis.solvers.BrentSolver.doSolve(BrentSolver.java:118)
at org.apache.commons.math3.analysis.solvers.BaseAbstractUnivariateSolver.solve(BaseAbstractUnivariateSolver.java:190)
at org.apache.commons.math3.analysis.solvers.BaseAbstractUnivariateSolver.solve(BaseAbstractUnivariateSolver.java:195)
at normal_sample_size$.requiredSizeCalculationNormal(normal_sample_size.scala:46)
at normal_sample_size$.delayedEndpoint$normal_sample_size$1(normal_sample_size.scala:31)
at normal_sample_size$delayedInit$body.apply(normal_sample_size.scala:22)
at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
at scala.App$class.main(App.scala:76)
at normal_sample_size$.main(normal_sample_size.scala:22)
at normal_sample_size.main(normal_sample_size.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
Process finished with exit code 1
So the problem seems to be that the function doesn't change sign within the specified interval.
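A quick way to confirm this (a sketch, reusing func and the same bracket as in the code above) is to evaluate the function at both endpoints before calling the solver:
// Brent's method needs the function values at the two bracketing endpoints to have
// opposite signs; if they don't, widen or shift the bracket before calling solve().
val atLower = func.value(0.001)
val atUpper = func.value(1000000)
println(s"f(lower) = $atLower, f(upper) = $atUpper")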
I have data containing the following:
{"field1":{"data1": 1},"field2":100,"field3":"more data1","field4":123.001}
{"field1":{"data2": 1},"field2":200,"field3":"more data2","field4":123.002}
{"field1":{"data3": 1},"field2":300,"field3":"more data3","field4":123.003}
{"field1":{"data4": 1},"field2":400,"field3":"more data4","field4":123.004}
I uploaded it to S3 and converted it to a Hive table using the following from the Hive console:
ADD JAR s3://elasticmapreduce/samples/hive-ads/libs/jsonserde.jar;
CREATE EXTERNAL TABLE impressions (json STRING ) ROW FORMAT DELIMITED LINES TERMINATED BY '\n' LOCATION 's3://my-bucket/';
The query:
SELECT * FROM impressions;
gives output fine, but as soon as I try to use the get_json_object UDF
and run the query:
SELECT get_json_object(impressions.json, '$.field2') FROM impressions;
I get the following error:
> SELECT get_json_object(impressions.json, '$.field2') FROM impressions;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
java.io.IOException: cannot find dir = s3://my-bucket/snapshot.csv in pathToPartitionInfo: [s3://my-bucket/]
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:291)
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:258)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CombineHiveInputSplit.<init>(CombineHiveInputFormat.java:108)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:423)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1036)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1028)
at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:172)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:944)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:897)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:897)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:871)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:479)
at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:261)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:218)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:567)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Job Submission failed with exception 'java.io.IOException(cannot find dir = s3://my-bucket/snapshot.csv in pathToPartitionInfo: [s3://my-bucket/])'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
Does anyone know what is wrong?
Any reason why you are declaring:
ADD JAR s3://elasticmapreduce/samples/hive-ads/libs/jsonserde.jar;
But not using the serde in your table definition? See the code snippet below for how to use it. I can't see any reason to use get_json_object here.
CREATE EXTERNAL TABLE impressions (
field1 string, field2 string, field3 string, field4 string
)
ROW FORMAT
serde 'com.amazon.elasticmapreduce.JsonSerde'
with serdeproperties ( 'paths'='field1, field2, field3, field4' )
LOCATION 's3://mybucket' ;
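With the columns mapped by the serde, the query then becomes a plain projection, for example:
SELECT field2 FROM impressions;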