Hive hash function resulting in 0, null and 1, why?

I am using Hive 0.13.1 and hashing a combination of keys with the default Hive hash function.
Something like:
select hash(date, token1, token2, parameters["a"], parameters["b"], parameters["c"]) from table1;
I ran it on 150M rows. For 60% of the rows it hashed correctly; for the remaining rows it gave 0, null or 1 as the hash. I looked at the rows that resulted in bad hashes and I don't see anything wrong with them. What could be causing it?

The hash function returns 0 only when all supplied arguments are blank or null.
If you are familiar with Java, you can check the implementation of the hash function.
Internally it uses ObjectInspectorUtils.hashCode to get the hash code for each supplied field; use the Java snippet below to reproduce the behaviour manually:
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.Text;

public class TestHash
{
    public static void main(String[] args)
    {
        // both of these print 0: null and the empty string hash to the same value
        System.out.println(ObjectInspectorUtils.hashCode(null, PrimitiveObjectInspectorFactory.javaStringObjectInspector));
        System.out.println(ObjectInspectorUtils.hashCode(new Text(""), PrimitiveObjectInspectorFactory.writableStringObjectInspector));
    }
}
Maven dependencies required to run the above program:
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>2.1.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.7.2</version>
</dependency>
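When several arguments are passed, hash() combines the per-field hash codes, so a row whose selected columns are all null or empty strings collapses to 0, and very small results such as 1 typically mean only one field contributed a tiny hash while the rest were blank. Below is a minimal sketch of that combination logic, assuming the accumulation used by Hive's GenericUDFHash; it is an illustration, not the Hive source:
// Sketch only: per-field hashing is delegated to ObjectInspectorUtils.hashCode,
// which returns 0 for both null and the empty string, so an "all blank" row ends up as 0.
public class HashCombineSketch {

    static int combine(int... fieldHashes) {
        int result = 0;
        for (int h : fieldHashes) {
            result = result * 31 + h; // accumulation assumed to match GenericUDFHash
        }
        return result;
    }

    public static void main(String[] args) {
        // six fields, all null or "": every per-field hash is 0, so the row hashes to 0
        System.out.println(combine(0, 0, 0, 0, 0, 0)); // prints 0
    }
}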

Related

How to parameterize a .json file in Rest Assured?

I am new to the Rest Assured automation framework, so I need help. I have to automate a simple API where I send the request in the body:
given().log().all().contentType("application/json").body(payload).when().log().all().post("THE POST URL").then().log().all().assertThat().statusCode(200);
I have to read the request from a JSON file, and I am able to read it from the .json file successfully. But I want to parameterize the values, and I can't work out how to parameterize the file. Here is the sample .json file:
{
    "id" : 5,
    "name" : "Harry"
}
I do not want to hardcode the values of id and name here, but instead parameterize them using data providers or any other method. Any pointers would be helpful.
A good practice for API testing with Rest-Assured is the POJO approach. It helps you avoid manipulating the JSON file (which is another form of hardcoding).
Step 1: You define a POJO
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

@Data
@AllArgsConstructor
@NoArgsConstructor
public class Person {
    private int id;
    private String name;
}
I use Lombok to generate the boilerplate code.
Step 2: Create a data-provider method
import java.util.Arrays;
import java.util.Iterator;
import org.testng.annotations.DataProvider;

@DataProvider(name = "create")
public Iterator<Person> createData() {
    Person p1 = new Person(1, "Json");
    Person p2 = new Person(2, "James");
    Person p3 = new Person(3, "Harry");
    return Arrays.asList(p1, p2, p3).iterator();
}
Step 3: Write the test
import static io.restassured.RestAssured.given;
import static io.restassured.http.ContentType.JSON;
import org.testng.annotations.Test;

@Test(dataProvider = "create")
void test1(Person person) {
    given().log().all().contentType(JSON)
            .body(person)
            .post("YOUR_URL")
            .then().log().all().assertThat().statusCode(200);
}
You need to add two libraries to your project classpath to make the above code work.
<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <version>1.18.20</version>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.13.0</version>
</dependency>
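For reference, a small illustration (not part of the original answer): with jackson-databind on the classpath, Rest-Assured serializes the Person POJO into the request body for you. You can check the JSON it will send with a one-off ObjectMapper call:
import com.fasterxml.jackson.databind.ObjectMapper;

public class SerializationCheck {
    public static void main(String[] args) throws Exception {
        // This mirrors what Rest-Assured does internally when it is given a POJO body
        // and Jackson is available: the object is serialized to JSON.
        String body = new ObjectMapper().writeValueAsString(new Person(1, "Json"));
        System.out.println(body); // prints something like {"id":1,"name":"Json"}
    }
}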

How to use default values of a Kotlin (1.4.21) data class with Jackson (2.12.0) and Quarkus (1.11.1)

I am using Quarkus 1.11.1 with Kotlin 1.4.21 and Jackson 2.12.0.
I don't understand why, when I send a POST request whose body maps to a data class with a defined default parameter, the request is not accepted and returns the error: Parameter specified as non-null is null
In the pom.xml file I have:
<dependency>
    <groupId>com.fasterxml.jackson.module</groupId>
    <artifactId>jackson-module-kotlin</artifactId>
    <version>2.12.0</version>
</dependency>
<dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-resteasy-jackson</artifactId>
</dependency>
<dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-jackson</artifactId>
</dependency>
The Quarkus Documentation says (https://quarkus.io/guides/kotlin#kotlin-and-jackson):
If the com.fasterxml.jackson.module:jackson-module-kotlin dependency and the quarkus-jackson extension (or the quarkus-resteasy-jackson extension) have been added to the project, then Quarkus automatically registers the KotlinModule to the ObjectMapper bean (see this guide for more details).
I have a data class like:
data class MyAttributes
@BsonCreator constructor(
    @BsonProperty("myId")
    @JsonProperty("myId")
    var myId: String,

    @BsonProperty("name")
    @JsonProperty("name")
    val name: String,

    @BsonProperty("data")
    @JsonProperty("data", defaultValue = "{}")
    var data: MutableMap<String, Any> = mutableMapOf()
)
I noticed that defaultValue in the @JsonProperty annotation does not help, because it is used only to document the expected values (https://fasterxml.github.io/jackson-annotations/javadoc/2.12/com/fasterxml/jackson/annotation/JsonProperty.html#defaultValue--).
If I send a JSON like:
{
    "myId": "AB123",
    "name": "my attribute name"
}
I get the error described previously, and the default value of the data field is ignored.
If I send:
{
    "myId": "AB123",
    "name": "my attribute name",
    "data": {}
}
I don't get an error, because I also send the data field.
Can you tell me where I am going wrong, please?
Thanks
Do you have a default constructor? By default, Jackson requires a default (no-argument) constructor. That is not very common with data classes, so you can either provide one or register the Kotlin module, for example:
@Bean
fun objectMapper(): ObjectMapper = ObjectMapper()
    .registerKotlinModule()
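Since this is Quarkus rather than Spring, the equivalent of that @Bean would be an ObjectMapperCustomizer. Below is a minimal sketch in Java, assuming the io.quarkus.jackson.ObjectMapperCustomizer SPI available in Quarkus 1.11; if the Kotlin module is already auto-registered as the guide says, this is redundant, but it makes the registration explicit:
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.module.kotlin.KotlinModule;
import io.quarkus.jackson.ObjectMapperCustomizer;
import javax.inject.Singleton;

// Registers the Kotlin module on the ObjectMapper that Quarkus injects everywhere,
// so Kotlin default parameter values are honoured during deserialization.
@Singleton
public class KotlinModuleCustomizer implements ObjectMapperCustomizer {

    @Override
    public void customize(ObjectMapper objectMapper) {
        objectMapper.registerModule(new KotlinModule());
    }
}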

PowerMockito and Mockito conflict

I need to build unit tests (with JUnit) for a legacy system. The method I need to test makes use of a static method, and I need to check that it is called. So I'll need to use PowerMockito (for "regular" mocking, we use Mockito).
But when I include the PowerMockito statements inside the test, Mockito fails with an org.mockito.exceptions.misusing.UnfinishedStubbingException. If I comment out the lines PowerMockito.mockStatic(Application.class), PowerMockito.doNothing().when(Application.class) and PowerMockito.verifyStatic(), the UnfinishedStubbingException does not occur, but then I am not able to check whether my IllegalArgumentException occurred.
The method under test looks like:
public class ClientMB {
    public void loadClient(Client client) {
        try {
            if (client == null) {
                throw new IllegalArgumentException("Client is mandatory!");
            }
            setClient(clientService.findById(client.getId()));
        } catch (Exception ex) {
            Application.handleException(ex);
        }
    }
}
And the test looks like:
@PrepareForTest({ Application.class })
@RunWith(PowerMockRunner.class)
public class ClientMBTest {
    @Test
    public final void testLoadClient() {
        ClientService mockedClientService = Mockito.mock(ClientService.class);
        Mockito.when(mockedClientService.findById(42L)).thenReturn(new Client());
        PowerMockito.mockStatic(Application.class);
        PowerMockito.doNothing().when(Application.class);
        ClientMB cmb = new ClientMB(mockedClientService);
        cmb.loadClient(null);
        PowerMockito.verifyStatic();
    }
}
I imported PowerMockito using the latest version.
<dependency>
    <groupId>org.powermock</groupId>
    <artifactId>powermock-module-junit4</artifactId>
    <version>1.6.2</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.powermock</groupId>
    <artifactId>powermock-api-mockito</artifactId>
    <version>1.6.2</version>
    <scope>test</scope>
</dependency>
What am I doing wrong? Any advice is welcome.
PowerMockito.doNothing().when(Application.class);
That's a stubbing command, but because you don't make a method call after the when(...), it's unfinished.
PowerMockito.doNothing().when(Application.class);
Application.someApplicationMethod();
You need this syntax because the normal doVerb().when(foo) form expects an instance, and Java only issues a warning when you call a static method through an instance rather than the class name.
If you want to stub all of Application's methods, you can do so by passing another argument into mockStatic:
PowerMockito.mockStatic(Application.class, RETURNS_SMART_NULLS);
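Putting it together, the test from the question would look roughly like this (a sketch, not tested; Application.handleException is the static method the code under test calls, and verifyStatic() is the no-argument PowerMock 1.x form):
import org.junit.Test;
import org.junit.runner.RunWith;
import org.mockito.Mockito;
import org.powermock.api.mockito.PowerMockito;
import org.powermock.core.classloader.annotations.PrepareForTest;
import org.powermock.modules.junit4.PowerMockRunner;

@PrepareForTest({ Application.class })
@RunWith(PowerMockRunner.class)
public class ClientMBTest {

    @Test
    public void testLoadClientHandlesNullClient() {
        ClientService mockedClientService = Mockito.mock(ClientService.class);

        PowerMockito.mockStatic(Application.class);

        // Stubbing: doNothing().when(Application.class) must be followed immediately
        // by the static call being stubbed, otherwise the stubbing is "unfinished".
        PowerMockito.doNothing().when(Application.class);
        Application.handleException(Mockito.any(Exception.class));

        new ClientMB(mockedClientService).loadClient(null);

        // Verification uses the same two-step pattern.
        PowerMockito.verifyStatic();
        Application.handleException(Mockito.any(Exception.class));
    }
}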

Class cast exception inside neo4j-spatial code

The following code snippet:
GraphDatabaseService graphDb = new EmbeddedGraphDatabase("var/geo");
// Wrap it as a spatial db service
SpatialDatabaseService spatialDb = new SpatialDatabaseService(graphDb);
// Create the layer to store our spatial data
EditableLayer runningLayer = (EditableLayer) spatialDb.getOrCreateLayer("running", SimplePointEncoder.class, EditableLayerImpl.class, "lon:lat");
fails with the error:
Exception in thread "main" java.lang.ClassCastException: org.neo4j.collections.graphdb.impl.EmbeddedGraphDatabase cannot be cast to org.neo4j.kernel.GraphDatabaseAPI
at org.neo4j.cypher.ExecutionEngine.<init>(ExecutionEngine.scala:113)
at org.neo4j.cypher.javacompat.ExecutionEngine.<init>(ExecutionEngine.java:53)
at org.neo4j.cypher.javacompat.ExecutionEngine.<init>(ExecutionEngine.java:43)
at org.neo4j.collections.graphdb.ReferenceNodes.getReferenceNode(ReferenceNodes.java:60)
at org.neo4j.gis.spatial.SpatialDatabaseService.getSpatialRoot(SpatialDatabaseService.java:76)
at org.neo4j.gis.spatial.SpatialDatabaseService.getLayer(SpatialDatabaseService.java:108)
at org.neo4j.gis.spatial.SpatialDatabaseService.getOrCreateLayer(SpatialDatabaseService.java:202)
at com.bmt.contain.spatial.test.SpatialTest.main(SpatialTest.java:47)
I was trying to get the sample code from here to work. I have included the relevant import statements below, in case I am somehow importing the wrong class.
import org.neo4j.collections.graphdb.impl.EmbeddedGraphDatabase;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.gis.spatial.EditableLayer;
import org.neo4j.gis.spatial.EditableLayerImpl;
import org.neo4j.gis.spatial.Layer;
import org.neo4j.gis.spatial.encoders.SimplePointEncoder;
Can someone advise me?
Also, I am using Neo4j 2.0.1 and version 0.13 of neo4j-spatial.
<dependency>
    <groupId>org.neo4j</groupId>
    <artifactId>neo4j-spatial</artifactId>
    <version>0.13-neo4j-2.0.1</version>
</dependency>
<dependency>
    <groupId>org.neo4j</groupId>
    <artifactId>neo4j</artifactId>
    <version>2.0.1</version>
</dependency>
The answer to this question is that you need to use
GraphDatabaseService graphDb = new GraphDatabaseFactory().newEmbeddedDatabase("var/geo");
instead of
GraphDatabaseService graphDb = new EmbeddedGraphDatabase("var/geo");
Under the hood this creates a different embedded GraphDatabaseService implementation, one that actually implements org.neo4j.kernel.GraphDatabaseAPI (the EmbeddedGraphDatabase imported from org.neo4j.collections does not), so the class cast exception goes away.
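For completeness, here is a sketch of the corrected setup (untested, assuming Neo4j 2.0.1, where GraphDatabaseFactory lives in org.neo4j.graphdb.factory; the EmbeddedGraphDatabase import from org.neo4j.collections is gone):
import org.neo4j.gis.spatial.EditableLayer;
import org.neo4j.gis.spatial.EditableLayerImpl;
import org.neo4j.gis.spatial.SpatialDatabaseService;
import org.neo4j.gis.spatial.encoders.SimplePointEncoder;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.factory.GraphDatabaseFactory;

public class SpatialTest {
    public static void main(String[] args) {
        // The factory returns an embedded database that implements GraphDatabaseAPI,
        // which is what the Cypher ExecutionEngine inside neo4j-spatial expects.
        GraphDatabaseService graphDb = new GraphDatabaseFactory().newEmbeddedDatabase("var/geo");

        // Wrap it as a spatial db service and create the layer to store our spatial data
        SpatialDatabaseService spatialDb = new SpatialDatabaseService(graphDb);
        EditableLayer runningLayer = (EditableLayer) spatialDb.getOrCreateLayer(
                "running", SimplePointEncoder.class, EditableLayerImpl.class, "lon:lat");
    }
}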

Fuse IDE: how to define a database table endpoint

I have heard a lot of integration success stories about Apache Camel with Fuse. Hence, I am just starting to explore the Fuse IDE, with a simple task in mind that I would like to achieve:
Read a fixed-length file
Parse the fixed-length file
Persist it to a MySQL database table
I am only able to get as far as:
Reading the fixed-length file (with the endpoint "file:src/data/Japan?noop=true")
Defining a Marshal step with Bindy and a POJO model package with the @FixedLengthRecord annotation
Then I am stuck: how do I persist the POJO into a MySQL database table? I can see the JDBC, iBATIS and JPA endpoints, but how do I accomplish that in Fuse IDE?
My POJO package:
package com.mbww.model;

import org.apache.camel.dataformat.bindy.annotation.DataField;
import org.apache.camel.dataformat.bindy.annotation.FixedLengthRecord;

@FixedLengthRecord(length=91)
public class Japan {
    @DataField(pos=1, length=10)
    private String TNR;
    @DataField(pos=11, length=10)
    private String ATR;
    @DataField(pos=21, length=70)
    private String STR;
}
Well, you can use any of the following components to read from and write to the database:
JDBC
IBATIS
MyBATIS
SPRING-JDBC
SQL
Custom Processor
I am going to show you how to use a custom processor to insert the rows into a table. The main reason for this is that you will get to work with the messages and the exchange, which will give you more of an insight into Camel. All of the other components can be used by following the documentation on the Camel site.
So let's review what you have. You are reading the file and converting the body to a Bindy object, so for each line in your text file Camel will send a Bindy object of class com.mbww.model.Japan to the next endpoint. That next endpoint needs to talk to the database. There is one problem I can spot immediately: you are using marshal where you should be using unmarshal.
The documentation clearly states: If you receive a message from one of the Camel Components such as File, HTTP or JMS you often want to unmarshal the payload into some bean so that you can process it using some Bean Integration or perform Predicate evaluation and so forth. To do this use the unmarshal word in the DSL in Java or the Xml Configuration.
Your Bindy class looks good, but it is missing getters and setters; modify the class to look like this:
package com.mbww.model;

import org.apache.camel.dataformat.bindy.annotation.DataField;
import org.apache.camel.dataformat.bindy.annotation.FixedLengthRecord;

@FixedLengthRecord(length=91)
public class Japan {
    @DataField(pos=1, length=10)
    private String TNR;
    @DataField(pos=11, length=10)
    private String ATR;
    @DataField(pos=21, length=70)
    private String STR;

    public String getTNR() {
        return TNR;
    }
    public void setTNR(String tNR) {
        TNR = tNR;
    }
    public String getATR() {
        return ATR;
    }
    public void setATR(String aTR) {
        ATR = aTR;
    }
    public String getSTR() {
        return STR;
    }
    public void setSTR(String sTR) {
        STR = sTR;
    }
}
First you need a connection to your database from the route. Start by adding the MySQL driver JAR to your Maven dependencies: open your pom.xml file and add the following dependency.
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <!-- use this version of the driver or a later version of the driver -->
    <version>5.1.25</version>
</dependency>
Next we need to declare a custom processor to use in the route; it will use this driver to insert the received body into a table.
So let's create a new class in Fuse IDE called PersistToDatabase, code below:
package com.mbww.JapanData;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

import org.apache.camel.Body;
import org.apache.camel.Exchange;
import org.apache.camel.Handler;
import org.apache.camel.Headers;

import com.mbww.model.Japan;

public class PersistToDatabase {

    @Handler
    public void PersistRecord(@Body Japan msgBody, @Headers Map hdr, Exchange exch) throws Exception {
        // Make sure the MySQL driver is available and registered.
        try {
            Class.forName("com.mysql.jdbc.Driver");
        } catch (ClassNotFoundException e) {
            System.out.println("Where is your MySQL JDBC Driver?");
            e.printStackTrace();
            return;
        }
        System.out.println("MySQL JDBC Driver Registered!");

        Connection connection = null;
        try {
            connection = DriverManager.getConnection("jdbc:mysql://localhost:3306/databasename", "root", "password");
        } catch (SQLException e) {
            System.out.println("Connection Failed! Check output console");
            e.printStackTrace();
            return;
        }
        if (connection != null) {
            System.out.println("You made it, take control of your database now!");
        } else {
            System.out.println("Failed to make connection!");
        }

        // Insert the unmarshalled record into the table.
        try {
            PreparedStatement stmt = connection.prepareStatement("INSERT INTO JapanDate(TNR,ATR,STR) VALUES(?,?,?)");
            stmt.setString(1, msgBody.getTNR());
            stmt.setString(2, msgBody.getATR());
            stmt.setString(3, msgBody.getSTR());
            int rows = stmt.executeUpdate();
            System.out.println("Number of rows inserted: " + Integer.toString(rows));
        } catch (Exception e) {
            System.out.println("Error in executing sql statement: " + e.getMessage());
            throw new Exception(e.getMessage());
        }
    }
}
This class is a POJO, nothing fancy except the @Handler annotation on PersistRecord. This annotation tells Camel that the PersistRecord method will handle the message exchange. You will also notice that PersistRecord has a parameter of type Japan: as mentioned earlier, the conversion (unmarshal) step in your Camel route translates each line into a Japan object and passes it along the route.
The rest of the code just handles the JDBC connection and executes an insert statement.
We are almost done; just one last thing to do. We need to declare this class in our Camel route XML. This file will typically be called camel-route.xml or blueprint.xml depending on your archetype. Open the Source tab and add the line <bean id="JapanPersist" class="com.mbww.JapanData.PersistToDatabase"/> before the <camelContext> tag.
This declares a new bean called JapanPersist based on the class we just created. You can now reference this bean inside your Camel route.
Thus the final route XML file should look something like this (note that the unmarshal step references a Bindy data format with id Japan, which is assumed to have been declared for the route):
<?xml version="1.0" encoding="UTF-8"?>
<blueprint xmlns="http://www.osgi.org/xmlns/blueprint/v1.0.0"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns:camel="http://camel.apache.org/schema/blueprint"
           xsi:schemaLocation="
             http://www.osgi.org/xmlns/blueprint/v1.0.0 http://www.osgi.org/xmlns/blueprint/v1.0.0/blueprint.xsd
             http://camel.apache.org/schema/blueprint http://camel.apache.org/schema/blueprint/camel-blueprint.xsd">

    <bean id="JapanPersist" class="com.mbww.JapanData.PersistToDatabase"/>

    <camelContext trace="false" id="blueprintContext" xmlns="http://camel.apache.org/schema/blueprint">
        <route id="JapanDataFromFileToDB">
            <from uri="file:src/data/japan"/>
            <unmarshal ref="Japan"/>
            <bean ref="JapanPersist"/>
        </route>
    </camelContext>
</blueprint>
Once you understand this technique you can start scaling the solution, using a splitter, connection pooling and threading to do massive amounts of concurrent inserts; a rough sketch of the splitter idea follows.
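As an illustration of the splitter idea (a sketch only, not from the original answer, written in the Java DSL rather than the blueprint XML above; depending on your Camel version the Bindy data format is configured with the model package name or the model class):
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.model.dataformat.BindyType;

import com.mbww.JapanData.PersistToDatabase;

public class JapanSplitRoute extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        from("file:src/data/japan?noop=true")
            // Split the file line by line so each record becomes its own exchange
            // and is inserted individually; streaming() avoids loading the whole file at once.
            .split(body().tokenize("\n")).streaming()
            .unmarshal().bindy(BindyType.Fixed, "com.mbww.model")
            .bean(PersistToDatabase.class);
    }
}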
Using the technique above, you learned how to inject your own beans into a Camel route, which gives you the ability to work with the messages directly in code.
I have not tested the code, so there will probably be a bug or two, but the idea should be clear.