fluent and webhdfs filename with 19700101

I run td-agent on Ubuntu 14.04 with the following configuration:
<source>
type tail
format json
path /path/tomcat/logs/file-input.log
tag bhc.hdfs
pos_file /var/td-agent/file.pos
</source>
<match bhc.hdfs>
type webhdfs
port 50070
host my.host.name
path /hdfs/path/file.${hostname}.%Y%m%d.log
username user
flush_interval 10s
output_include_time false
output_include_tag false
output_data_type json
</match>
The log source file /path/tomcat/logs/file-input.log contains only structured JSON data.
The NTP daemon is installed and running, but when td-agent creates the file in HDFS, the date in the filename is 19700101.
What's wrong?

Fluentd records carry a time, and the webhdfs plugin names files using each record's timestamp, not the current time.
By default, the tail plugin takes the record time from a field named time. If your log data keeps its time information in a different field, you can specify it with time_key and time_format.
See also: http://docs.fluentd.org/articles/in_tail
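To see why a record timestamp, rather than the system clock, can put 19700101 in the filename, here is a minimal Python sketch. It is not the plugin's code: render_path is a hypothetical helper, and it assumes the path placeholders are expanded strftime-style in UTC. If a record's time ends up as 0 (the Unix epoch), the date part renders as 19700101 no matter what NTP says.

```python
import time

def render_path(template: str, epoch_seconds: float) -> str:
    """Expand strftime placeholders in a path template using a record time (UTC)."""
    return time.strftime(template, time.gmtime(epoch_seconds))

template = "/hdfs/path/file.%Y%m%d.log"

# A record whose time is 0 (Unix epoch) yields the 19700101 filename.
print(render_path(template, 0))

# A record that carries a real timestamp yields that record's date instead.
print(render_path(template, 1400000000))
```

So the thing to check is what time each record actually carries, which is what time_key and time_format control on the input side.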

Related

Keycloak on kubernetes and logging json layout format with log4j2

I have Keycloak deployed in Kubernetes using the official codecentric chart. Now I want Keycloak to log in JSON format so that I can export the logs to Kibana.
A comment on the original reply pointed to a CLI command that does this.
cli:
  # Custom CLI script
  custom: |
    /subsystem=logging/json-formatter=json:add(exception-output-type=formatted, pretty-print=false, meta-data={label=value})
    /subsystem=logging/console-handler=CONSOLE:write-attribute(name=named-formatter, value=json)
Keycloak is a Java application running on WildFly. If you check the main process running inside the pod, you will see something like:
/usr/lib/jvm/java/bin/java -D[Standalone] -server -Xms64m -Xmx512m -XX:MetaspaceSize=96M -XX:MaxMetaspaceSize=256m -Djava.net.preferIPv4Stack=true -Djboss.modules.system.pkgs=org.jboss.byteman -Djava.awt.headless=true -Dorg.jboss.boot.log.file=/opt/jboss/keycloak/standalone/log/server.log -Dlogging.configuration=file:/opt/jboss/keycloak/standalone/configuration/logging.properties -jar /opt/jboss/keycloak/jboss-modules.jar -mp /opt/jboss/keycloak/modules org.jboss.as.standalone -Djboss.home.dir=/opt/jboss/keycloak -Djboss.server.base.dir=/opt/jboss/keycloak/standalone -Djboss.bind.address=10.217.0.231 -Djboss.bind.address.private=10.217.0.231 -b 0.0.0.0 -c standalone.xml
The important part here is the following:
-Dlogging.configuration=file:/opt/jboss/keycloak/standalone/configuration/logging.properties
So, the logging configuration is passed to the Java process as a JVM option, and read from the file on the path /opt/jboss/keycloak/standalone/configuration/logging.properties.
If you check the content of the file, it has a section like the following:
...
handler.CONSOLE=org.jboss.logmanager.handlers.ConsoleHandler
handler.CONSOLE.level=INFO
handler.CONSOLE.formatter=COLOR-PATTERN
handler.CONSOLE.properties=autoFlush,target,enabled
handler.CONSOLE.autoFlush=true
handler.CONSOLE.target=SYSTEM_OUT
handler.CONSOLE.enabled=true
...
You need to figure out what to change in this logging configuration to meet your JSON requirements. An example would be:
formatter.json=org.jboss.logmanager.formatters.JsonFormatter
formatter.json.properties=keyOverrides,exceptionOutputType,metaData,prettyPrint,printDetails,recordDelimiter
formatter.json.constructorProperties=keyOverrides
formatter.json.keyOverrides=timestamp\=#timestamp
formatter.json.exceptionOutputType=FORMATTED
formatter.json.metaData=#version\=1
formatter.json.prettyPrint=false
formatter.json.printDetails=false
formatter.json.recordDelimiter=\n
Then, in Kubernetes you can create a ConfigMap with the logging config that you want, define it as a volume in your pod/deployment, and mount it as a file to that exact path in the pod/deployment definition. If you do all steps correctly, you should be able to customize the logging format as you need.
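As a rough illustration of the "figure out what to change" step, the Python sketch below rewrites the handler.CONSOLE.formatter line so that the console handler points at the json formatter from the example above. This rests on assumptions: switch_console_formatter is a hypothetical helper, and a real logging.properties may need further entries (for example wherever formatter names are declared) before the server accepts it.

```python
def switch_console_formatter(properties_text: str, formatter_name: str) -> str:
    """Return the properties text with handler.CONSOLE.formatter set to formatter_name."""
    key = "handler.CONSOLE.formatter="
    out = []
    for line in properties_text.splitlines():
        # Only the formatter assignment is rewritten; every other line is kept.
        if line.strip().startswith(key):
            line = key + formatter_name
        out.append(line)
    return "\n".join(out) + "\n"

original = """handler.CONSOLE=org.jboss.logmanager.handlers.ConsoleHandler
handler.CONSOLE.level=INFO
handler.CONSOLE.formatter=COLOR-PATTERN
"""

print(switch_console_formatter(original, "json"))
```

The edited text, together with the formatter.json.* lines, is what would go into the ConfigMap that gets mounted over the original file.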

Failed loading positionFile: error while using the TAILDIR source in Flume

I am working with Flume to append data from a local directory to HDFS using the Flume TAILDIR source.
My use case is a delta load: when a new line is added to a source file in the local directory, it should be appended in HDFS.
This is my Flume conf file:
#configure the agent
agent.sources=r1
agent.channels=k1
agent.sinks=c1
agent.sources.r1.type=TAILDIR
agent.sources.r1.positionFile = /home/flume/Documents/taildir_position.json
agent.sources.r1.filegroups=f1
agent.sources.r1.filegroups.f1=/home/flume/Documents/spooldir/
agent.sources.r1.batchSize = 20
agent.sources.r1.writePosInterval=2000
agent.sources.r1.maxBackoffSleep=5000
agent.sources.r1.fileHeader = true
agent.sources.r1.channels=k1
agent.channels.k1.type=memory
agent.channels.k1.capacity=10000
agent.channels.k1.transactionCapacity=1000
agent.sinks.c1.type=hdfs
agent.sinks.c1.channel=k1
agent.sinks.c1.hdfs.path=hdfs://localhost:8020/flume_sink
agent.sinks.c1.hdfs.batchSize = 1000
agent.sinks.c1.hdfs.rollSize = 268435456
agent.sinks.c1.hdfs.writeFormat=Text
When I run the Flume command: flume-ng agent -n agent -c conf -f /home/swechchha/Documents/flumereal.conf
I get an error loading the JSON position file.
Here is the code. It crashes at line 110. Please make sure that the flume user has access to that JSON file and that the file is correctly formatted.
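One way to check the "correctly formatted" part is to parse the position file yourself. The Python sketch below is only a guess at a useful check: position_file_problem is a hypothetical helper, and the expected layout (a JSON array of objects with inode, pos and file keys) is an assumption about what the TAILDIR source writes, not something confirmed in this thread.

```python
import json

def position_file_problem(text: str):
    """Return a description of the first problem found, or None if the text looks usable."""
    try:
        entries = json.loads(text)
    except json.JSONDecodeError as exc:
        return f"not valid JSON: {exc}"
    if not isinstance(entries, list):
        return "top-level value is not a JSON array"
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            return f"entry {i} is not a JSON object"
        # Assumed keys; adjust if your Flume version writes something different.
        missing = {"inode", "pos", "file"} - entry.keys()
        if missing:
            return f"entry {i} is missing {sorted(missing)}"
    return None

print(position_file_problem(""))    # an empty file is reported as invalid JSON
print(position_file_problem("[]"))  # an empty array passes this check
```

To use it on the real file, read /home/flume/Documents/taildir_position.json as the flume user and pass the contents in; a permission error at that step points to the access problem instead.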
The Flume conf in the question has a problem.
TAILDIR source: watches the specified files and tails them in near real time once it detects new lines appended to each file. If new lines are still being written, this source retries reading them while waiting for the write to complete.
In the filegroups property, the directory may contain multiple files, so the value should be the directory path followed by a regular expression for the file names, for example:
agent.sources.r1.filegroups.f1=/home/flume/Documents/spooldir/.*txt.*
Then run Flume with this conf and check the result; it should work fine.
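To get a feel for which files a pattern like .*txt.* would pick up, here is a small Python sketch. It uses Python's re module as a stand-in for the Java regex engine and assumes the pattern has to match the whole file name; the file names are made up for illustration.

```python
import re

def matching_files(names, pattern):
    """Return the names that the pattern matches in full."""
    regex = re.compile(pattern)
    return [name for name in names if regex.fullmatch(name)]

# Hypothetical directory listing for /home/flume/Documents/spooldir/
names = ["events.txt", "events.txt.1", "server.log"]

print(matching_files(names, r".*txt.*"))  # ['events.txt', 'events.txt.1']
```

Note that this is a regular expression, not a shell glob, so a value like *.txt would not mean what it does on the command line.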

How to tail multiple files in fluentd

I have set up a fluentd logger and I am able to monitor a file using the fluentd tail input plugin. All the data received by fluentd is later published to an Elasticsearch cluster. Below is the configuration file for fluentd:
<source>
@type tail
path /home/user/Documents/log_data.json
format json
tag myfile
</source>
<match *myfile*>
@type elasticsearch
hosts 192.168.48.118:9200
user <username>
password <password>
index_name fluentd
type_name fluentd
</match>
As you can see, I am monitoring the log_data.json file using tail. I also have a file log_user.json in the same directory; I want to monitor it as well and publish its logs to Elasticsearch. To do this, I thought of creating another <source> and <match> with a different tag, but that started producing errors.
How can I monitor multiple files in fluentd and publish them to Elasticsearch? I see that when fluentd starts, its worker is started. Is it possible to start multiple workers so that each one monitors a different file, or is there another way of doing it? Can anyone point me to some good links or tutorials?
Thanks.
You can use multiple <source> and <match> sections.
A label can help you bind them together.
Here is an example:
<source>
@label @mainstream
@type tail
path /home/user/Documents/log_data.json
format json
tag myfile
</source>
<label @mainstream>
<match **>
@type copy
<store>
@type elasticsearch
host elasticsearch
port 9200
logstash_format true
logstash_prefix fluentd
logstash_dateformat %Y%m%d
include_tag_key true
type_name access_log
tag_key @log_name
<buffer>
flush_mode interval
flush_interval 1s
retry_type exponential_backoff
flush_thread_count 2
retry_forever true
retry_max_interval 30
chunk_limit_size 2M
queue_limit_length 8
overflow_action block
</buffer>
</store>
</match>
</label>

How to store apache logs into mongodb using fluentd?

I am using the Fluentd data collector to store Apache logs in MongoDB. I made the necessary changes in the td-agent configuration file:
<source>
@type tail
format apache2
path C:\Program Files (x86)\Apache Group\Apache2\logs\access.log
tag mongo.apache
</source>
and
<match mongo.**>
# plugin type
@type mongo
# mongodb db + collection
database apache
collection access
# mongodb host + port
host localhost
port 27017
# interval
flush_interval 10s
# make sure to include the time key
include_time_key true
</match>
I made all the changes in the td-agent.conf file.
The path of the log file is C:\Program Files (x86)\Apache Group\Apache2\logs\access
The path of the position file is C:\var\log\td-agent\apache2.access_log.pos
To test the configuration, I sent requests to the Apache server using the command:
ab -n 100 -c 10 http://localhost/
This command comes from http://docs.fluentd.org/v0.12/articles/apache-to-mongodb, the tutorial I followed to send logs to MongoDB.
Everything seems to work: the database and collection were created,
but the log entries were not stored in MongoDB.
I also installed "Apache Group" to work with Apache Bench.

convert VMX to OVF using OVFtool

I am trying to convert a VMX to OVF format using OVF Tool as shown below, but it gives an error:
C:\Program Files\VMware\VMware OVF Tool>ovftool.exe
vi://vcenter.com:port/folder/myfolder/abc.vmx abc.ovf
Error: Failed to open file: https://vcenter.com:port/folder/myfolder/abc.vmx
Completed with errors
Please let me know if you have any solution.
I had a similar situation in vmware fusion trying to use a .vmx that was probably created on windows. I could boot the VM, but any attempt to export the machine with ovftool or use vmware-vdiskmanager bombed out with:
Error: Failed to open disk: source.vmdk
Completed with errors
the diskname was totally valid, path was valid, permissions were valid, and the only clue was running ovftool with:
ovftool --X:logToConsole --X:logLevel=verbose source.vmx dest.ova
Opening VMX source: source.vmx
verbose -[10C2513C0] Opening source
verbose -[10C2513C0] Failed to open disk: ./source.vmdk
verbose -[10C2513C0] Exception: Failed to open disk: source.vmdk. Reason: Disk encoding error
Error: Failed to open disk: source.vmdk
as others suggested, i took a peek in the .vmdk. therein i found 3 other clues:
encoding="windows-1252"
createType="monolithicSparse"
# Extent description
RW 16777216 SPARSE "source.vmdk"
so first i converted the monolithicSparse vmdk to "preallocated virtual disk split in 2GB files":
vmware-vdiskmanager -r source.vmdk -t3 foo.vmdk
then i could edit the "foo.vmdk" to change the encoding, which now looks like:
encoding="utf-8"
createType="twoGbMaxExtentFlat"
# Extent description
RW 8323072 FLAT "foo-f001.vmdk" 0
RW 8323072 FLAT "foo-f002.vmdk" 0
RW 131072 FLAT "foo-f003.vmdk" 0
and finally, after fixing up the source.vmx:
scsi0:0.fileName = "foo.vmdk"
profit:
ovftool source.vmx dest.ova
...
Opening VMX source: source.vmx
Opening OVA target: dest.ova
Writing OVA package: dest.ova
Transfer Completed
Completed successfully
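As a sanity check on the converted descriptor, the sector counts of the three new extents should add up to the single extent of the original disk. The short Python sketch below does that arithmetic; total_bytes is a hypothetical helper, and it relies on VMDK descriptors counting extents in 512-byte sectors.

```python
SECTOR_BYTES = 512  # VMDK extent sizes are given in 512-byte sectors

def total_bytes(sector_counts):
    """Return the total size in bytes of the listed extents."""
    return sum(sector_counts) * SECTOR_BYTES

original_extents = [16777216]                # RW 16777216 SPARSE "source.vmdk"
split_extents = [8323072, 8323072, 131072]   # foo-f001, foo-f002, foo-f003

# Both layouts describe the same amount of disk.
print(total_bytes(original_extents) == total_bytes(split_extents))  # True
print(total_bytes(split_extents) / 2**30)                           # 8.0 (GiB)
```

If the totals did not match after a conversion, that would be a sign the descriptor was edited incorrectly.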
I had a similar problem with OVFTool trying to export to OVF format.
Export failed: Failed to open file: C:\Virtual\test\test.vmx.
First, I opened .VMX file in editor (it's a text file) and made sure that settings like
scsi0:0.fileName = "test.vmdk"
nvram = "test.nvram"
extendedConfigFile = "test.vmxf"
mention proper file names.
Then I noticed this line:
.encoding = "windows-1251"
This is Cyrillic code page, so I modified it to use Western code page
.encoding = "windows-1252"
Then, running OVFTool gave a different error
Export failed: Failed to open disk: test.vmdk.
To fix it I had to open .VMDK file in HEX editor (because it's usually a big binary file), found there the string
encoding = "windows-1251"
(it's somewhere in the beginning of the file), and replaced "1251" with "1252".
And it did the trick!
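If you would rather not do this by hand in a hex editor, the same replacement can be scripted. The Python sketch below is one possible approach, not a VMware tool: patch_encoding is a hypothetical helper, the 4096-byte search window is an assumption about where the descriptor text sits, and the replacement is forced to be the same length so that nothing else in the binary file shifts.

```python
def patch_encoding(data: bytes, old: bytes = b"windows-1251",
                   new: bytes = b"windows-1252", header_bytes: int = 4096) -> bytes:
    """Replace the first occurrence of old within the first header_bytes of data."""
    if len(old) != len(new):
        raise ValueError("replacement must keep the same length")
    index = data.find(old, 0, header_bytes)
    if index == -1:
        return data  # nothing to patch near the start of the file
    return data[:index] + new + data[index + len(old):]

# Stand-in for the start of a .vmdk file: descriptor text followed by binary data.
sample = b'encoding = "windows-1251"\n' + b"\x00" * 32
patched = patch_encoding(sample)
print(patched[:26])
print(len(patched) == len(sample))  # True
```

Work on a copy of the disk; for a large .vmdk you would also want to read and rewrite only the first few kilobytes rather than loading the whole file as this sketch does.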
In my case, I needed to repair the disk 'abc.vmdk' before converting 'abc.vmx' to 'abc.ovf'.
Use this on Linux:
$ /usr/bin/vmware-vdiskmanager -R /home/user/VMware/abc.vmdk
See https://kb.vmware.com/s/article/2019259 for resolving the issue on Windows and Linux.
Try to run as described below.
C:\Program Files\VMware\VMware OVF Tool>ovftool C:\Win-Test\Win-Test.vmx(location of your vmx file) C:\Win-Test\win-test.ovf (destination)
Maybe ovftool is unable to recognize the path you are giving.
Try with following command:
ovftool --eula@=[path to eula] --X:logToConsole --targetType=OVA --compress=9 vi://[username]:[ESX address] [target address]
Once you provide the ESX address, it lists the folders you have created on your ESX box. You can then run the command above again with the folder name appended.
If there is no folder hierarchy on your box, it simply lists the VM names.
Retry the same command, appending [foldername]/[vmname no vmx file name required]
ovftool --eula@=[path to eula] --X:logToConsole --targetType=OVA --compress=9 vi://[username]:[ESX address]/[foldername if exist]/[vmname no vmx file name required] [target address]
I had this same exact issue. In my case I opened up the VMX file and dropped the IDE and sound controllers from the file and saved. I was then able to convert everything to an OVA using the tool with the standard syntax.
e.g. I dropped:
ide1:0.present = "TRUE"
ide1:0.deviceType = "cdrom-image"
and:
sound.present = "TRUE"
sound.fileName = "-1"
sound.autodetect = "TRUE"
This allowed me to convert the file like normal.
For me opening the .vmx and deleting the following line worked:
sata0:1.deviceType = "cdrom-image"
In my case, this works:
ide1:0.present = "TRUE"
ide1:0.deviceType = "cdrom-image"
I changed TRUE to FALSE and it works fine; since the cdrom-image does not exist, this change permits the format conversion.
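The .vmx edits described in the last few answers can also be scripted. The Python sketch below flips the present flag to "FALSE" for chosen devices and leaves every other line alone; disable_devices is a hypothetical helper and the sample text only mirrors the lines quoted above, so treat it as a sketch to adapt rather than a complete tool.

```python
def disable_devices(vmx_text: str, devices) -> str:
    """Set '<device>.present' to "FALSE" for each named device, leaving other lines alone."""
    targets = {f"{device}.present" for device in devices}
    out = []
    for line in vmx_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if key in targets:
            line = f'{key} = "FALSE"'
        out.append(line)
    return "\n".join(out) + "\n"

sample = '''ide1:0.present = "TRUE"
ide1:0.deviceType = "cdrom-image"
sound.present = "TRUE"
'''

print(disable_devices(sample, ["ide1:0", "sound"]))
```

Keep a backup of the original .vmx before overwriting it, so the devices can be restored after the export.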
If your goal is to move a Windows-based VM to VirtualBox, you only need to:
uninstall vmware tools from the guest vm
shut down the machine
copy the hd to a new folder
create a new empty vm in virtualbox
mount the hd (the .vmdk file) in that vm
Easy and quick to do.