Spring Data Couchbase - connect to several nodes - couchbase

The Java Couchbase client allows connecting to several nodes in a cluster (in case one of them is not available).
Is this possible in Spring Data Couchbase?
I'm using Couchbase 2.1 and XML configuration for Spring.

Yes, you can configure spring-data this way. When you configure the CouchbaseClient using the CouchbaseFactoryBean, it accepts a comma-delimited list of hosts. Here is an example of configuring the CouchbaseClient bean:
<couchbase:couchbase bucket="myBucket" password="" host="host1,host2,host3"/>
This assumes you are using the 1.4.x couchbase-client.jar dependency; as long as you are on spring-data 1.1.5, you are fine. You didn't specify your spring-data dependencies, but more than likely you are good here.

The only way to do this in Spring Data Couchbase 2.x and later is shown below: a cluster with three servers and three buckets, each with its own user and password.
<couchbase:cluster id="cluster_info" env-ref="couchbaseEnv2">
    <couchbase:node>server1</couchbase:node>
    <couchbase:node>server2</couchbase:node>
    <couchbase:node>server3</couchbase:node>
</couchbase:cluster>
<couchbase:env id="couchbaseEnv2" connectTimeout="20000" computationPoolSize="10" />
<couchbase:clusterInfo cluster-ref="cluster_info" id="cluster1" login="user1" password="zzzzz1"/>
<couchbase:clusterInfo cluster-ref="cluster_info" id="cluster2" login="user2" password="zzzzz2"/>
<couchbase:clusterInfo cluster-ref="cluster_info" id="cluster3" login="user3" password="zzzzz3"/>
<couchbase:bucket id="bucket1" bucketName="user1" cluster-ref="cluster_info" bucketPassword="zzzzz1"/>
<couchbase:bucket id="bucket2" bucketName="user2" cluster-ref="cluster_info" bucketPassword="zzzzz2"/>
<couchbase:bucket id="bucket3" bucketName="user3" cluster-ref="cluster_info" bucketPassword="zzzzz3"/>
<couchbase:template id="couchBaseTemplate1" bucket-ref="bucket1" clusterInfo-ref="cluster1" />
<couchbase:template id="couchBaseTemplate2" bucket-ref="bucket2" clusterInfo-ref="cluster2" />
<couchbase:template id="couchBaseTemplate3" bucket-ref="bucket3" clusterInfo-ref="cluster3" />
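For reference, the same multi-node setup can also be expressed in Java configuration. The following is only a minimal sketch, assuming the AbstractCouchbaseConfiguration base class from spring-data-couchbase 2.x; the class name is a placeholder and the credentials mirror the first bucket above.

import java.util.Arrays;
import java.util.List;

import org.springframework.context.annotation.Configuration;
import org.springframework.data.couchbase.config.AbstractCouchbaseConfiguration;

// Minimal sketch (assumption: spring-data-couchbase 2.x Java config).
// The bootstrap hosts mirror the <couchbase:node> entries above.
@Configuration
public class CouchbaseConfig extends AbstractCouchbaseConfiguration {

    @Override
    protected List<String> getBootstrapHosts() {
        return Arrays.asList("server1", "server2", "server3");
    }

    @Override
    protected String getBucketName() {
        return "user1"; // placeholder bucket name
    }

    @Override
    protected String getBucketPassword() {
        return "zzzzz1"; // placeholder bucket password
    }
}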

Related

Read list of strings from MySQL Stored Proc in .NET 6

I have a MySQL (not SQL Server) database with a Stored Procedure that returns a tabular result with one (1) column of strings.
I would like to get that result into my .NET application as some sort of IEnumerable<string> or List<string> etc.
What should I do?
I've tried playing with MySql.EntityFrameworkCore but get stuck quickly. Entity Framework Core either wants to generate tables based on models or models based on tables. I want neither. I just want my strings, plain and simple.
I've tried making a POCO with a single property and the [Keyless] attribute, but no dice. If I define a DbSet<Poco>, then the table doesn't exist; if I try to do context.Set<Poco>().FromSql("call my_stored_proc();"), then EF Core complains the DbSet doesn't exist.
I'm using .NET 6 and the latest version of the above-mentioned MySQL EntityFrameworkCore NuGet package. Searching for answers is made harder by a lot of answers either assuming SQL Server or using older versions of EF Core with methods that my EF Core doesn't seem to have. And some results claim that EF Core 6 doesn't work with .NET 6?
I'm also happy bypassing EF entirely if that's easier.
What you are asking for will eventually be available in EF Core 7.0 - Raw SQL queries for unmapped types.
Until then, the minimum you need to do is define a simple POCO class with a single property, register it as a keyless entity, and use ToView(null) to prevent EF Core from associating a db table/view with it.
e.g.
POCO:
public class StringValue
{
    public string Value { get; set; }
}
Your DbContext subclass:
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<StringValue>(builder =>
    {
        builder.HasNoKey(); // keyless
        builder.ToView(null); // no table/view
        builder.Property(e => e.Value)
            .HasColumnName("column_alias_in_the_sp_result");
    });
}
Usage:
var query = context.Set<StringValue>()
    .FromSqlRaw(...)
    .AsEnumerable() // needed if the SP call / raw SQL is not composable, as for SqlServer
    .Select(r => r.Value);
EF is an ORM, not a data access library. The actual data access is performed by ADO.NET using db-specific providers. You don't need an ORM to run a query and receive results:
await using DbConnection connection = new MySqlConnection("Server=myserver;User ID=mylogin;Password=mypass;Database=mydatabase");
await connection.OpenAsync();
using DbCommand command = new MySqlCommand("SELECT field FROM table;", connection);
await using var reader = await command.ExecuteReaderAsync();
while (await reader.ReadAsync())
    Console.WriteLine(reader.GetString(0));
ADO.NET provides the interfaces and base implementations like DbConnection. Individual providers supply the database-specific implementations. This means the samples you see for SQL Server work with minimal modifications for any other database. To execute a stored procedure you need to set the CommandType to System.Data.CommandType.StoredProcedure:
using var command = new MySqlCommand("my_stored_proc", connection);
command.CommandType = CommandType.StoredProcedure;
In this case I used the open source MySqlConnector provider, which offers true asynchronous commands and fixes a lot of the bugs found in Oracle's official Connector/.NET aka MySQL.Data. The official MySql.EntityFrameworkCore uses the official provider and inherits its problems.
ORMs like EF and micro-ORMs like Dapper work on top of ADO.NET to generate SQL queries and map results to objects. To work with EF Core use Pomelo.EntityFrameworkCore.MySql. With 25M downloads it's also far more popular than MySql.EntityFrameworkCore (1.4M).
If you only want to map raw results to objects, try Dapper. It constructs the necessary commands based on the query and parameters provided as anonymous objects, opens and closes connections as needed, and maps results to objects using reflection. Until recently it was a lot faster than EF Core in raw queries but EF caught up in EF Core 7:
IEnumerable<User> users = cnn.Query<User>("get_user",
    new { RoleId = 1 },
    commandType: CommandType.StoredProcedure);
No other configuration is needed. Dapper will map columns to User properties by name and return an IEnumerable<T> of the desired class.
The equivalent functionality will be added in EF Core 7's raw SQL queries for unmapped types.

Can I watch only one field in Couchbase with kafka-connect (CDC)?

We are trying to move our database from MySQL to Couchbase and implement some CDC (change data capture) logic for copying data to our new db.
All environments are set up and running: MySQL, Debezium, Kafka, Couchbase, Kubernetes, the pipeline, etc. We have also set up our Kafka source connector for Debezium. Here it is:
- name: "our-connector"
config:
connector.class: "io.debezium.connector.mysql.MySqlConnector"
tasks.max: "1"
group.id: "our-connector"
database.server.name: "our-api"
database.hostname: "******"
database.user: "******"
database.password: "******"
database.port: "3306"
database.include.list: "our_db"
column.include.list: "our_db.our_table.our_field"
table.include.list: "our_db.our_table"
database.history.kafka.topic: "inf.our_table.our_db.schema-changes"
database.history.kafka.bootstrap.servers: "kafka-cluster-kafka-bootstrap.kafka:9092"
value.converter: "org.apache.kafka.connect.json.JsonConverter"
value.converter.schemas.enable: "false"
key.converter: "org.apache.kafka.connect.json.JsonConverter"
key.converter.schemas.enable: "false"
snapshot.locking.mode: "none"
tombstones.on.delete: "false"
event.deserialization.failure.handling.mode: "ignore"
database.history.skip.unparseable.ddl: "true"
include.schema.changes: "false"
snapshot.mode: "initial"
transforms: "extract,filter,unwrap"
predicates: "isOurTableChangeOurField"
predicates.isOurTableChangeOurField.type: "org.apache.kafka.connect.transforms.predicates.TopicNameMatches"
predicates.isOurTableChangeOurField.pattern: "our-api.our_db.our_table"
transforms.filter.type: "com.redhat.insights.kafka.connect.transforms.Filter"
transforms.filter.if: "!!record.value() && record.value().get('op') == 'u' && record.value().get('before').get('our_field') != record.value().get('after').get('our_field')"
transforms.filter.predicate: "isOurTableChangeOurField"
transforms.unwrap.type: "io.debezium.transforms.ExtractNewRecordState"
transforms.unwrap.drop.tombstones: "false"
transforms.unwrap.delete.handling.mode: "drop"
transforms.extract.type: "org.apache.kafka.connect.transforms.ExtractField{{.DOLLAR_SIGN}}Key"
transforms.extract.field: "id"
This configuration publishes this message to Kafka (captured from Kowl).
As you can see, we have the original record's id and the changed field's new value.
No problem so far. Actually, we do have a problem :) Our field is of type DATETIME in MySQL, but Debezium publishes it as unix time.
First question: how can we publish this as a formatted datetime (YYYY-mm-dd HH:ii:mm, for example)?
Let's move on.
Here is the actual problem: we have searched a lot, but all the examples write the whole record to Couchbase. We have already created this record in Couchbase and just want to keep the data up to date; actually, we have manipulated the data as well.
Here is example data from Couchbase.
We want to change only the bill.dateAccepted field in Couchbase. We tried some YAML config but had no success on the sink.
Here is our sink config:
- name: "our-sink-connector-1"
config:
connector.class: "com.couchbase.connect.kafka.CouchbaseSinkConnector"
tasks.max: "2"
topics: "our-api.our_db.our_table"
couchbase.seed.nodes: "dev-couchbase-couchbase-cluster.couchbase.svc.cluster.local"
couchbase.bootstrap.timeout: "10s"
couchbase.bucket: "our_bucket"
couchbase.topic.to.collection: "our-api.our_db.our_table=our_bucket._default.ourCollection"
couchbase.username: "*******"
couchbase.password: "*******"
key.converter: "org.apache.kafka.connect.storage.StringConverter"
key.converter.schemas.enable: "false"
value.converter: "org.apache.kafka.connect.json.JsonConverter"
value.converter.schemas.enable: "false"
connection.bucket : "our_bucket"
connection.cluster_address: "couchbase://couchbase-srv.couchbase"
couchbase.document.id: "${/id}"
Partial answer to your first question: one approach would be to use an SPI converter to convert the unix datetime to a string. If you want to convert all the datetimes and your input message contains many datetime fields, you can just look at the JDBC type and do the conversion:
https://debezium.io/documentation/reference/stable/development/converters.html
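Purely as an illustration (not taken from the linked documentation), a rough sketch of such a converter could look like the following. It assumes Debezium's CustomConverter SPI, uses a made-up class name, and would be registered on the source connector via the converters option (e.g. converters: "datetime" and datetime.type: "com.example.DateTimeToStringConverter"):

import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Properties;

import org.apache.kafka.connect.data.SchemaBuilder;

import io.debezium.spi.converter.CustomConverter;
import io.debezium.spi.converter.RelationalColumn;

// Sketch only (hypothetical class name): emit MySQL DATETIME columns as
// "yyyy-MM-dd HH:mm:ss" strings instead of epoch values.
public class DateTimeToStringConverter implements CustomConverter<SchemaBuilder, RelationalColumn> {

    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

    @Override
    public void configure(Properties props) {
        // no options needed for this sketch
    }

    @Override
    public void converterFor(RelationalColumn column, ConverterRegistration<SchemaBuilder> registration) {
        if (!"DATETIME".equalsIgnoreCase(column.typeName())) {
            return; // leave all other column types to the default conversion
        }
        registration.register(SchemaBuilder.string().optional(), value -> {
            if (value == null) {
                return null;
            }
            if (value instanceof LocalDateTime) {
                return FORMAT.format(((LocalDateTime) value).toInstant(ZoneOffset.UTC));
            }
            if (value instanceof Number) { // epoch value, as seen in the captured message
                return FORMAT.format(Instant.ofEpochMilli(((Number) value).longValue()));
            }
            return value.toString();
        });
    }
}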
As for extracting I/U events, you can write a custom SMT (single message transform) that has the before and after records as well as the operation type (I/U/D), and extract the delta by comparing the before and after fields. When I tried something like this in the past, I bumped into the following project, which came in quite handy as a reference. This way you have a delta field and a key, and the sink can just update those fields instead of updating the full document (though the sink has to support it; that will come at some point):
https://github.com/michelin/kafka-connect-transforms-qlik-replicate
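For a flavor of what such an SMT could look like, here is a sketch only (made-up class name, not taken from the project above). It assumes it runs on the source connector before ExtractNewRecordState, so the value is still the full Debezium envelope Struct:

import java.util.Map;
import java.util.Objects;

import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.ConnectRecord;
import org.apache.kafka.connect.data.Field;
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaBuilder;
import org.apache.kafka.connect.data.Struct;
import org.apache.kafka.connect.transforms.Transformation;

// Sketch of a custom SMT that keeps only the fields whose value differs
// between the Debezium "before" and "after" structs.
public class ExtractChangedFields<R extends ConnectRecord<R>> implements Transformation<R> {

    @Override
    public R apply(R record) {
        if (!(record.value() instanceof Struct)) {
            return record; // tombstones and non-struct values pass through untouched
        }
        Struct envelope = (Struct) record.value();
        Struct before = envelope.getStruct("before");
        Struct after = envelope.getStruct("after");
        if (before == null || after == null) {
            return record; // snapshot reads, inserts and deletes: nothing to diff
        }

        // Build a reduced schema holding only the changed fields.
        SchemaBuilder builder = SchemaBuilder.struct().name("delta");
        for (Field field : after.schema().fields()) {
            if (!Objects.equals(before.get(field.name()), after.get(field.name()))) {
                builder.field(field.name(), field.schema());
            }
        }
        Schema deltaSchema = builder.build();
        Struct delta = new Struct(deltaSchema);
        for (Field field : deltaSchema.fields()) {
            delta.put(field.name(), after.get(field.name()));
        }

        return record.newRecord(record.topic(), record.kafkaPartition(),
                record.keySchema(), record.key(), deltaSchema, delta, record.timestamp());
    }

    @Override
    public void configure(Map<String, ?> configs) {
        // no configuration for this sketch
    }

    @Override
    public ConfigDef config() {
        return new ConfigDef();
    }

    @Override
    public void close() {
    }
}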
The Couchbase source connector does not support watching individual fields. In general, the Couchbase source connector is better suited for replication than for change data capture. See the caveats mentioned in the Delivery Guarantees documentation.
The Couchbase Kafka sink connector supports partial document updates via the built-in SubDocumentSinkHandler or N1qlSinkHandler. You can select the sink handler by configuring the couchbase.sink.handler connector config property, and customize its behavior with the Sub Document Sink Handler config options.
Here's a config snippet that tells the connector to update the bill.dateAccepted property with the entire value of the Kafka record. (You'd also need to use a Single Message Transform to extract just this field from the source record.)
couchbase.sink.handler=com.couchbase.connect.kafka.handler.sink.SubDocumentSinkHandler
couchbase.subdocument.path=/bill/dateAccepted
If the built-in sink handlers are not flexible enough, you can write your own custom sink handler using the CustomSinkHandler.java example as a template.

OpenDDS and OpenSplice interoperability

I have two programs, one using OpenSplice 6.7.1 and the other using OpenDDS 3.10.
They are both using RTPS as the protocol, with the same domain ID and destination port (I verified this using Wireshark).
The problem is that they are not communicating.
I don't know if I am doing anything wrong with the config... I am using the basic config for OpenDDS with RTPS and for OpenSplice I used the provided ospl.xml after changing the domain ID.
Here are my config files.
For OpenDDS:
[common]
DCPSGlobalTransportConfig=$file
DCPSDefaultDiscovery=DEFAULT_RTPS
[transport/the_rtps_transport]
transport_type=rtps_udp
For OpenSplice:
<OpenSplice>
    <Domain>
        <Name>ospl_sp_ddsi</Name>
        <Id>223</Id>
        <SingleProcess>true</SingleProcess>
        <Description>Stand-alone 'single-process' deployment and standard DDSI networking.</Description>
        <Service name="ddsi2">
            <Command>ddsi2</Command>
        </Service>
        <Service name="durability">
            <Command>durability</Command>
        </Service>
        <Service name="cmsoap">
            <Command>cmsoap</Command>
        </Service>
    </Domain>
    <DDSI2Service name="ddsi2">
        <General>
            <NetworkInterfaceAddress>AUTO</NetworkInterfaceAddress>
            <AllowMulticast>true</AllowMulticast>
            <EnableMulticastLoopback>true</EnableMulticastLoopback>
            <CoexistWithNativeNetworking>false</CoexistWithNativeNetworking>
        </General>
        <Compatibility>
            <!-- see the release notes and/or the OpenSplice configurator on DDSI interoperability -->
            <StandardsConformance>lax</StandardsConformance>
            <!-- the following one is necessary only for TwinOaks CoreDX DDS compatibility -->
            <!-- <ExplicitlyPublishQosSetToDefault>true</ExplicitlyPublishQosSetToDefault> -->
        </Compatibility>
    </DDSI2Service>
    <DurabilityService name="durability">
        <Network>
            <Alignment>
                <TimeAlignment>false</TimeAlignment>
                <RequestCombinePeriod>
                    <Initial>2.5</Initial>
                    <Operational>0.1</Operational>
                </RequestCombinePeriod>
            </Alignment>
            <WaitForAttachment maxWaitCount="100">
                <ServiceName>ddsi2</ServiceName>
            </WaitForAttachment>
        </Network>
        <NameSpaces>
            <NameSpace name="defaultNamespace">
                <Partition>*</Partition>
            </NameSpace>
            <Policy alignee="Initial" aligner="true" durability="Durable" nameSpace="defaultNamespace"/>
        </NameSpaces>
    </DurabilityService>
    <TunerService name="cmsoap">
        <Server>
            <PortNr>Auto</PortNr>
        </Server>
    </TunerService>
</OpenSplice>
What am I doing wrong?
Multi-vendor interoperability has been demonstrated repeatedly at OMG events but not recently, so maybe a regression has happened with/in either of the products.
Your OpenSplice configuration is a proper default configuration, apart from the domain ID, which should match the one used in your application (typically users pass DDS::DOMAIN_ID_DEFAULT to indicate they want the ID specified in the configuration pointed to by the OSPL_URI environment variable). I'm sure you are aware that the AUTO setting for the network interface/IP address to use is a potential source of confusion if you use multi-homed machines.
So the next step would be to look at both (DDSI) traces and/or Wireshark captures and see if you spot DDSI wire-frames from both vendors (vendor ID 1.2 for PrismTech, 1.3 for OCI).
If, for instance, there is no sign of vendor 1.3 being identified in the OpenSplice DDSI traces, that suggests there are still some 'fundamental' communication issues.
Note that at these OMG events we typically used the (for us 'bundled') iShapes example on domain '0' with a module-less IDL topic-type specification to verify interoperability, so if it doesn't work for your application, that's something worth trying too (and check/use Wireshark in combination with that example as well).
I'll also keep watching the community forum for new information on this.

Number of lines read with Spring Batch ItemReader

I am using Spring Batch to write a CSV file to the database. This works just fine.
I am using a FlatFileItemReader and a custom ItemWriter. I am using no processor.
The import takes quite some time, and on the UI you don't see any progress. I implemented a progress bar and have some global properties where I can store some information (like the number of lines to read or the current import index).
My question is: how can I get the number of lines in the CSV?
Here's my XML:
<batch:job id="importPersonsJob" job-repository="jobRepository">
    <batch:step id="importPersonStep">
        <batch:tasklet transaction-manager="transactionManager">
            <batch:chunk reader="personItemReader"
                         writer="personItemWriter"
                         commit-interval="5"
                         skip-limit="10">
                <batch:skippable-exception-classes>
                    <batch:include class="java.lang.Throwable"/>
                </batch:skippable-exception-classes>
            </batch:chunk>
            <batch:listeners>
                <batch:listener ref="skipListener"/>
                <batch:listener ref="chunkListener"/>
            </batch:listeners>
        </batch:tasklet>
    </batch:step>
    <batch:listeners>
        <batch:listener ref="authenticationJobListener"/>
        <batch:listener ref="afterJobListener"/>
    </batch:listeners>
</batch:job>
I already tried to use the ItemReadListener interface, but that didn't work either.
If you need to know how many lines were read, it's available in Spring Batch itself;
take a look at the StepExecution.
The method getReadCount() should give you the number you are looking for.
You need to add a step execution listener to your step in your XML configuration. To do that (copied from the Spring documentation):
<step id="step1">
<tasklet>
<chunk reader="reader" writer="writer" commit-interval="10"/>
<listeners>
<listener ref="chunkListener"/>
</listeners>
</tasklet>
</step>
where "chunkListner" is a bean of yours annotated with a method annotated with #AfterStep to tell spring batch to call it after your step.
You should also take a look at the Spring reference documentation for step configuration.
Hope that helps.

MySQL with Symfony2

I don't want to use Symfony2's Doctrine. Instead, I want to write my own data classes to handle MySQL queries. So is there any way that SQL queries can be executed directly? Most articles on Google talk about Doctrine or MySQL+Doctrine.
If you don't want to use Doctrine ORM or even Doctrine DBAL, absolutely nothing stops you from using PDO/MySQLi directly.
Define a PDO instance as a DIC service:
<service id="pdo" class="PDO">
<argument>dns</argument>
<argument>user</argument>
<argument>password</argument>
<call method="setAttribute">
<argument>2</argument> <!-- use exception for error handling -->
</call>
</service>
Pass the PDO instance to each service that requires a database connection:
<service id="my.custom.service" class="My\Custom\Service">
<argument type="service" id="pdo" />
</serivce>
---
namespace My\Custom;
class Service {
public function __construct(PDO $pdo) { }
}
There's a cookbook about using Doctrine's DBAL Layer.