I am currently sending compressed messages through the Kafka producer, with compression enabled via the following properties:
compression.type="gzip"
compressed.topics="Topic_A, Topic_B"
When the KafkaProducer object is instantiated, the following properties are listed in the ProducerConfig object:
{compression.type=gzip, metric.reporters=[], metadata.max.age.ms=300000, metadata.fetch.timeout.ms=60000, reconnect.backoff.ms=50, sasl.kerberos.ticket.renew.window.factor=0.8, bootstrap.servers=[MyServer_IP:9092], retry.backoff.ms=100, sasl.kerberos.kinit.cmd=/usr/bin/kinit, buffer.memory=33554432, timeout.ms=30000, key.serializer=class org.apache.kafka.common.serialization.StringSerializer, sasl.kerberos.service.name=null, sasl.kerberos.ticket.renew.jitter=0.05, ssl.keystore.type=JKS, ssl.trustmanager.algorithm=PKIX, block.on.buffer.full=false, ssl.key.password=null, max.block.ms=60000, sasl.kerberos.min.time.before.relogin=60000, connections.max.idle.ms=540000, ssl.truststore.password=null, max.in.flight.requests.per.connection=5, metrics.num.samples=2, client.id=, ssl.endpoint.identification.algorithm=null, ssl.protocol=TLS, request.timeout.ms=30000, ssl.provider=null, ssl.enabled.protocols=[TLSv1.2, TLSv1.1, TLSv1], acks=1, batch.size=16384, ssl.keystore.location=null, receive.buffer.bytes=32768, ssl.cipher.suites=null, ssl.truststore.type=JKS, security.protocol=PLAINTEXT, retries=0, max.request.size=1048576, value.serializer=class org.apache.kafka.common.serialization.ByteArraySerializer, ssl.truststore.location=null, ssl.keystore.password=null, ssl.keymanager.algorithm=SunX509, metrics.sample.window.ms=30000, partitioner.class=class org.apache.kafka.clients.producer.internals.DefaultPartitioner, send.buffer.bytes=131072, linger.ms=0}
Problem:
In this list, "compression.type=gzip" is set as required, but "compressed.topics" is missing. As a result, compression is enabled for all topics, whereas I need it only on selected ones.
Findings:
I debugged the code and found that the "compressed.topics" property is not defined in the ProducerConfig.java class, so when the KafkaProducer object is instantiated it does not carry that property.
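For reference, here is a minimal sketch of the producer setup described above (the broker address and topic names are taken from the config dump; the sample record is a placeholder). The new Java producer only recognizes compression.type; an unknown key such as compressed.topics is merely logged as unrecognized and then ignored:
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "MyServer_IP:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
        "org.apache.kafka.common.serialization.StringSerializer");
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
        "org.apache.kafka.common.serialization.ByteArraySerializer");
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "gzip"); // applies to every topic this producer writes to
props.put("compressed.topics", "Topic_A,Topic_B");         // not a ProducerConfig key, so it is ignored

KafkaProducer<String, byte[]> producer = new KafkaProducer<>(props);
producer.send(new ProducerRecord<>("Topic_A", "key", "payload".getBytes()));
producer.close();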
I have two Phonograph objects, each with millions of rows, which I have linked using the Search Around methods.
In the example below, I filter an Object Set of Flights based on the departure code, then I search around to the Passengers on those flights, and then I filter again based on an attribute of the Passengers object.
const passengersDepartingFromAirport = Objects.search()
    .flights()
    .filter(flight => flight.departureAirportCode.exactMatch(airportCode))
    .searchAroundPassengers()
    .filter(passenger => passenger.passengerAttribute.exactMatch(value));
The result of the above code is:
LOG [2022-04-19T14:25:58.182Z] { osp: {},
objectSet:
{ objectSetProvider: '[Circular]',
objectSet: { type: 'FILTERED', filter: [Object], objectSet: [Object] } },
objectTypeIds: [ 'passengers' ],
emptyOrderByStep:
{ objectSet: '[Circular]',
orderableProperties:
{ attributeA: [Object],
attributeB: [Object],
attributeB: [Object],
...
Now, when I try to use take() or takeAsync(), or to aggregate the result using groupBy(), I receive the error below:
RemoteError: INVALID_ARGUMENT ObjectSet:ObjectSetTooLargeForSearchAround with instance ID xxx.
Error Parameters: {
"RemoteError.type": "STATUS",
"objectSetSize": "2160870",
"maxAllowedSize": "100000",
"relationSide": "TARGET",
"relationId": "flights-passengers"
}
SafeError: RemoteError: INVALID_ARGUMENT ObjectSet:ObjectSetTooLargeForSearchAround with instance ID xxx
What could be a way to aggregate or reduce the result of the above Object Set?
The current object storage infrastructure limits the size of the "left side", or "starting object set", of a search around to 100,000 objects.
You can define an object set that uses a search around, which is what you're seeing as the result when you execute the Function before attempting any further manipulation.
Using take() or groupBy() "forces" the resolution of the object set definition: you no longer have just a pointer to the objects, but need to actually materialize some data from each individual object to perform that operation.
It's in this materialization step that the limit comes into play - the object sets are resolved and, if the object set at the search around step is larger than 100,000 objects, the request will fail with the above message.
There is ongoing work on Object Storage v2, which will eventually support much larger search-around requests, but for now it's necessary to create a query pattern that results in fewer than 100,000 objects before performing the search around.
In some cases it's possible to create an "intermediate" object type that represents a different level of granularity in your data, or to invert the direction of your search around, in order to work within these limits.
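For example, here is a rough sketch of that narrowing approach, reusing the function from the question; departureDate and its value are assumptions, added purely to illustrate filtering the starting Flights set below the limit before the search around:
const passengersDepartingFromAirport = Objects.search()
    .flights()
    .filter(flight => flight.departureAirportCode.exactMatch(airportCode))
    // Assumed extra filter: keep the flights set under 100,000 objects
    // before .searchAroundPassengers() is applied.
    .filter(flight => flight.departureDate.exactMatch(departureDate))
    .searchAroundPassengers()
    .filter(passenger => passenger.passengerAttribute.exactMatch(value));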
I have defined a list for operational data in a YANG model as:
list listener-state {
  key "listener-name";
  config false;
  description
    "common statistics for given listener (i.e. sent messages)";
  uses listener-state-info;
  ...
}
I use the OpenDaylight API (org.opendaylight.yangtools.yang.data.codec.gson.JsonParserStream), which converts the JSON body of the request into an org.opendaylight.yangtools.yang.data.api.schema.NormalizedNode, in order to finally generate the XML RPC for the ConfD server.
In my case, I want to fetch all rows from this operational list, so I construct the JSON as:
"command": {"service" : {"server" : {"listener-state" : {}}}},
I then get an exception: "Input is missing some of the keys of listener-state".
Then I can add the key value to the JSON body:
"command": {"service" : {"server" : {"listener-state" : {"listener-name": "first"}}}},
In this case, I only get one row. I also tried leaving the key value blank:
"command": {"service" : {"server" : {"listener-state" : {"listener-name": ""}}}},
Then the response contains all the key values instead of all the rows. So my question is: what should the JSON be in order to get all rows in the list without knowing the key values?
This should be feasible, since I have figured out that an XML request can do it, but I can't work out what the matching JSON would be.
Thanks.
I did a bunch of investigation. Unfortunately, I don't think there is a way to fetch the whole table.
I have an object structure with three objects: location > lochierarchy > customtable.
In the original source XML (ER data), I only get details for the location object; I have derived the information for lochierarchy and customtable.
If I have at least one column value for lochierarchy and customtable, I am able to use the following code to fill in the derived values.
XML:
<LOCATIONS>
  <location>1000</location>
  <siteid>xyg</siteid>
  <LOCHIERARCHY>
    <SYSTEMID>abdc</SYSTEMID>
    <PARENT></PARENT>
    <CUSTOMTABLE>
      <DEPT>MECHANICAL</DEPT>
      <OWNER></OWNER>
    </CUSTOMTABLE>
  </LOCHIERARCHY>
</LOCATIONS>
List locHierarchyList = irData.getChildrenData("LOCHIERARCHY");
int locHrSize = locHierarchyList.size();
for (int i = 0; i < locHrSize; i++)
{
    irData.setAsCurrent(locHierarchyList, i);
    irData.setCurrentData("PARENT", "xyyyyg");
    List customTableList = irData.getChildrenData("CUSTOMTABLE");
    int custSize = customTableList.size();
    for (int j = 0; j < custSize; j++)
    {
        // set values
    }
}
But now I am getting the source XML with only the location data, as shown below, and I'm trying to build the children data myself. I am missing something here.
Incoming XML
<LOCATIONS>
  <location>1000</location>
  <siteid>xyg</siteid>
</LOCATIONS>
My Code
irData.createChildrenData("LOCHIERARCHY");
irData.setAsCurrent();
irData.setCurrentData("SYSTEMID", SYSTEM);
irData.setCurrentData("PARENT", parentLoc);
irData.createChildrenData("CUSTOMTABLE");
irData.setAsCurrent();
But this is not working. Can anyone help me out?
Got it; I just had to use another overload of createChildrenData.
irData.createChildrenData("LOCHIERARCHY",true);
This one did the trick: it creates the child set and makes it the current one.
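For completeness, a minimal sketch of the corrected sequence using that overload; SYSTEM and parentLoc are the same variables as in the question, and the DEPT/OWNER values are placeholders:
// Create the LOCHIERARCHY child set and make it the current record in one call,
// then populate it; repeat the same pattern for the nested CUSTOMTABLE.
irData.createChildrenData("LOCHIERARCHY", true);
irData.setCurrentData("SYSTEMID", SYSTEM);
irData.setCurrentData("PARENT", parentLoc);

irData.createChildrenData("CUSTOMTABLE", true);
irData.setCurrentData("DEPT", "MECHANICAL"); // placeholder value
irData.setCurrentData("OWNER", "someOwner"); // placeholder value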
What is the behavior of the DynamoDB BatchGetItem API if none of the requested keys exist in DynamoDB?
Does it return an empty list or throw an exception?
I am not sure about this after reading their doc: link
but I may be missing something.
BatchGetItem will not throw an exception. The results for those items will not be present in the Responses map in the response. This is also stated in the BatchGetItem documentation:
If a requested item does not exist, it is not returned in the result.
Requests for nonexistent items consume the minimum read capacity units
according to the type of read. For more information, see Capacity
Units Calculations in the Amazon DynamoDB Developer Guide.
This behavior is also easy to verify. This is for a Table with a hash key attribute named customer_id (the full example I am using is here):
// Both requested keys exist in the table:
dynamoDB.batchGetItem(new BatchGetItemSpec()
        .withTableKeyAndAttributes(new TableKeysAndAttributes(EXAMPLE_TABLE_NAME)
            .withHashOnlyKeys("customer_id", "ABCD", "EFGH")
            .withConsistentRead(true)))
    .getTableItems()
    .entrySet()
    .stream()
    .forEach(System.out::println);

// Neither requested key exists in the table:
dynamoDB.batchGetItem(new BatchGetItemSpec()
        .withTableKeyAndAttributes(new TableKeysAndAttributes(EXAMPLE_TABLE_NAME)
            .withHashOnlyKeys("customer_id", "TTTT", "XYZ")
            .withConsistentRead(true)))
    .getTableItems()
    .entrySet()
    .stream()
    .forEach(System.out::println);
Output:
example_table=[{ Item: {customer_email=jim#gmail.com, customer_name=Jim, customer_id=ABCD} }, { Item: {customer_email=garret#gmail.com, customer_name=Garret, customer_id=EFGH} }]
example_table=[]
I'm trying to call a REST web service from Talend, passing parameters that come from a DB.
(Earlier attempt, since revised:) I was trying to link the tPostgresqlInput component to the tREST component to see how to pass DB row values in the URL, but it seemed that tREST does not accept this.
This is what I had done up to that point:
tPostgresqlInput xxxxxx tREST ---> tExtractJSONFields ---> tMap ---> tPostgresqlOutput
I verified that the DB component returns the data below:
I updated the job like this:
The schema of tRESTClient is:
And I used globalMap to pass the values from the database:
The URL used is: "URL/search/" + (String)globalMap.get("row1.hashtag")
But when I look at the results, I find that it used a "null" value in the request to the server.
All I needed was to use a tFlowToIterate component to iterate over each row, so that we are able to access the input data extracted from the DB and set a dynamic URL.
For example, with the layout I had before, the job must look like this:
DBin --main(row1)--> tFlowToIterate --iterate--> tREST ---> tExtractJSONFields ---> tMap ---> DBout
On tREST, we can set a dynamic URL like:
"SOME_URL/otherpath/"+(String)globalMap.get("row1.columnName")