I'm trying to get the status of a host with the CheckMK WebAPI. Can someone point me in the right direction on how to get this data?
We're currently using CheckMK enterprise 1.4.0.
I've tried:
https://<monitoringhost.tld>/<site>/check_mk/webapi.py?action=get_host&_username=<user>&_secret=<secret>&output_format=json&effective_attributes=1&request={"hostname": "<hostname>"}
But the response does not have any relevant information about the host itself (e.g. state up/down, uptime, etc.).
{
"result": {
"attributes": {
"network_scan": {
"scan_interval": 86400,
"exclude_ranges": [],
"ip_ranges": [],
"run_as": "api"
},
"tag_agent": "cmk-agent",
"snmp_community": null,
"ipv6address": "",
"alias": "",
"management_protocol": null,
"site": "testjke",
"tag_address_family": "ip-v4-only",
"tag_criticality": "prod",
"contactgroups": [
true,
[]
],
"network_scan_result": {
"start": null,
"state": null,
"end": null,
"output": ""
},
"parents": [],
"management_address": "",
"tag_networking": "lan",
"ipaddress": "",
"management_snmp_community": null
},
"hostname": "<host>",
"path": ""
},
"result_code": 0
}
The WebAPI is only for getting and setting the configuration of a host or other objects. If you want to get the live status of a host, use Livestatus.
If you enabled Livestatus on port 6557 (the default), you can query the status of a host over the network. If you are logged into a shell locally, you can use 'lq'.
OMD[mysite]:~$ lq "GET hosts\nColumns: name"
Why:
The CheckMK WebAPI is for accessing WATO. WATO is the source for creating the Nagios configuration. Nagios does the monitoring of the hosts, and the Livestatus API is an extension of the Nagios core.
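As a rough sketch of the network variant, the same kind of query that 'lq' runs locally can be sent over TCP from Python. This assumes Livestatus is reachable on port 6557; the helper names build_query and query_livestatus, the host name "monitoringhost.tld" and the filter value "myhost" are placeholders of mine, not part of CheckMK.

```python
import json
import socket


def build_query(table, columns, filters=()):
    """Build a Livestatus query string that asks for JSON output."""
    lines = ["GET " + table, "Columns: " + " ".join(columns)]
    lines += ["Filter: " + f for f in filters]
    lines.append("OutputFormat: json")
    # A Livestatus request is terminated by an empty line.
    return "\n".join(lines) + "\n\n"


def query_livestatus(host, port, query, timeout=10):
    """Send one query over TCP and return the decoded JSON rows."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(query.encode("utf-8"))
        # Closing the write side tells the server the request is complete.
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return json.loads(b"".join(chunks).decode("utf-8"))


# Example call (placeholders, needs a reachable Livestatus port):
# q = build_query("hosts", ["name", "state"], ["name = myhost"])
# print(query_livestatus("monitoringhost.tld", 6557, q))
```

In the hosts table, the state column is 0 for up, 1 for down and 2 for unreachable.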
http://<monitoringhost.tld>/<site>/check_mk/view.py?view_name=allhosts&output_format=csv
You can use all the views that you see in the web UI by adding output_format=[csv|json|python].
You will get the data of the table that you see.
You also need to add the credentials as shown in your question.
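A minimal sketch of fetching such a view from Python, assuming the _username and _secret parameters from the question are accepted by view.py in the same way. The function names and the example values ("mysite", "automation", "s3cret") are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_view_url(base, site, view_name, username, secret, output_format="json"):
    """Assemble the view.py URL with the output format and credentials."""
    params = urlencode({
        "view_name": view_name,
        "output_format": output_format,
        "_username": username,
        "_secret": secret,
    })
    return "{}/{}/check_mk/view.py?{}".format(base.rstrip("/"), site, params)


def fetch_view(url, timeout=10):
    """Download the view and decode the JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (placeholders, needs a reachable CheckMK site):
# url = build_view_url("http://monitoringhost.tld", "mysite", "allhosts",
#                      "automation", "s3cret")
# rows = fetch_view(url)
```

It is worth printing the result once before relying on its layout, since the columns depend on how the view is configured.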
I am trying to delete an instance of longhorn, as well as the namespace, that is stuck in the terminating state.
I tried all three methods from the Longhorn documentation, and all of them fail. I cannot uninstall Longhorn using a Helm chart, as I never installed Longhorn through Helm in the first place. Uninstalling Longhorn using kubectl also fails to create job.batch/longhorn-uninstall because the namespace longhorn-system is in the Terminating state.
Editing the CRDs and finalizers, as per the troubleshooting documentation and the following site (https://avasdream.engineer/kubernetes-longhorn-stuck-terminating), also does not fix the stuck termination; in both cases, nothing changes. Using the script from https://github.com/longhorn/longhorn/blob/master/scripts/cleanup.sh also fails to terminate Longhorn, as it fails to find any of the resources.
The query kubectl get namespace longhorn-system -o json gives the following results:
{
"apiVersion": "v1",
"kind": "Namespace",
"metadata": {
"creationTimestamp": "2022-10-31T20:09:45Z",
"deletionTimestamp": "2023-01-26T18:17:03Z",
"labels": {
"kubernetes.io/metadata.name": "longhorn-system",
"name": "longhorn-system"
},
"name": "longhorn-system",
"resourceVersion": "41929420",
"uid": "f1c78184-4613-4f9d-939d-13947ac8befa"
},
"spec": {
"finalizers": [
"kubernetes"
]
},
"status": {
"conditions": [
{
"lastTransitionTime": "2023-01-26T18:29:35Z",
"message": "All resources successfully discovered",
"reason": "ResourcesDiscovered",
"status": "False",
"type": "NamespaceDeletionDiscoveryFailure"
},
{
"lastTransitionTime": "2023-01-26T18:17:27Z",
"message": "All legacy kube types successfully parsed",
"reason": "ParsedGroupVersions",
"status": "False",
"type": "NamespaceDeletionGroupVersionParsingFailure"
},
{
"lastTransitionTime": "2023-01-26T18:17:51Z",
"message": "All content successfully deleted, may be waiting on finalization",
"reason": "ContentDeleted",
"status": "False",
"type": "NamespaceDeletionContentFailure"
},
{
"lastTransitionTime": "2023-01-26T18:17:27Z",
"message": "Some resources are remaining: engines.longhorn.io has 2 resource instances, nodes.longhorn.io has 5 resource instances, orphans.longhorn.io has 1 resource instances, replicas.longhorn.io has 4 resource instances, snapshots.longhorn.io has 3 resource instances, volumes.longhorn.io has 2 resource instances",
"reason": "SomeResourcesRemain",
"status": "True",
"type": "NamespaceContentRemaining"
},
{
"lastTransitionTime": "2023-01-26T18:17:27Z",
"message": "Some content in the namespace has finalizers remaining: longhorn.io in 17 resource instances",
"reason": "SomeFinalizersRemain",
"status": "True",
"type": "NamespaceFinalizersRemaining"
}
],
"phase": "Terminating"
}
}
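For readability, the conditions that are actually blocking the deletion (those with status "True") can be pulled out of that JSON with a few lines of Python. This is only a reading aid for the output above, not a fix; the function name blocking_conditions is my own.

```python
import json


def blocking_conditions(namespace_json):
    """Return (type, message) for every namespace condition whose status is "True"."""
    ns = json.loads(namespace_json)
    conditions = ns.get("status", {}).get("conditions", [])
    return [(c["type"], c["message"]) for c in conditions if c.get("status") == "True"]


# Example use with the kubectl output saved to a file (file name is a placeholder):
# with open("longhorn-system.json") as fh:
#     for cond_type, message in blocking_conditions(fh.read()):
#         print(cond_type + ": " + message)
```

For the output above, this leaves only NamespaceContentRemaining and NamespaceFinalizersRemaining.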
The query kubectl api-resources --verbs=list --namespaced -o name | xargs -n 1 kubectl get --show-kind --ignore-not-found -n longhorn-system yields the following information.
NAME STATE NODE INSTANCEMANAGER IMAGE AGE
engine.longhorn.io/pvc-1886524a-5ba0-459d-9d51-b8044fec3057-e-344e3d26 stopped 64d
engine.longhorn.io/pvc-5aca06f9-adf1-45c8-b11d-6e79d9719001-e-79f24cb7 stopped 64d
NAME READY ALLOWSCHEDULING SCHEDULABLE AGE
node.longhorn.io/master0 False true True 86d
node.longhorn.io/master1 True true True 14d
node.longhorn.io/master2 False true True 86d
node.longhorn.io/worker0 False true True 70d
node.longhorn.io/worker1 True true True 70d
NAME TYPE NODE
orphan.longhorn.io/orphan-010ee0d16422c151e7e039e27fe2306815361596fa3f8b6cccc8a601b673e429 replica master0
NAME STATE NODE DISK INSTANCEMANAGER IMAGE AGE
replica.longhorn.io/pvc-1886524a-5ba0-459d-9d51-b8044fec3057-r-89dfabab stopped master2 c5a7e70d-09d8-43a2-9ba3-d5b65eb12b34 13d
replica.longhorn.io/pvc-1886524a-5ba0-459d-9d51-b8044fec3057-r-a6881548 running worker1 2d7f16e8-f11b-40e8-8935-7f0559f7674e instance-manager-r-8ccf914f longhornio/longhorn-engine:v1.3.2 16d
replica.longhorn.io/pvc-5aca06f9-adf1-45c8-b11d-6e79d9719001-r-52b5a290 stopped worker1 2d7f16e8-f11b-40e8-8935-7f0559f7674e 31d
replica.longhorn.io/pvc-5aca06f9-adf1-45c8-b11d-6e79d9719001-r-8f0ae6c9 running master2 c5a7e70d-09d8-43a2-9ba3-d5b65eb12b34 instance-manager-r-672003dc longhornio/longhorn-engine:v1.3.2 13d
NAME VOLUME CREATIONTIME READYTOUSE RESTORESIZE SIZE AGE
snapshot.longhorn.io/887f9621-5417-40b3-8999-c2695d5585d7 pvc-1886524a-5ba0-459d-9d51-b8044fec3057 2023-01-12T21:07:46Z false 10737418240 312860672 13d
snapshot.longhorn.io/8f11c48b-da51-4124-80b8-1316db88eb01 pvc-5aca06f9-adf1-45c8-b11d-6e79d9719001 2023-01-12T21:21:54Z false 21474836480 20096512000 13d
snapshot.longhorn.io/b4c31fe7-5ff5-4881-9cd8-b22fc73798bb pvc-5aca06f9-adf1-45c8-b11d-6e79d9719001 2023-01-12T21:38:23Z true 21474836480 102400 13d
NAME STATE ROBUSTNESS SCHEDULED SIZE NODE AGE
volume.longhorn.io/pvc-1886524a-5ba0-459d-9d51-b8044fec3057 attaching unknown 10737418240 master0 64d
volume.longhorn.io/pvc-5aca06f9-adf1-45c8-b11d-6e79d9719001 detaching unknown 21474836480 64d
Attempting to manually delete any of the items described also failed. All APIservices have Availability as TRUE.
What do I do to resolve this problem? I will provide any more information needed.
I have a Node.js application which runs on pm2, and I need to be able to send email notifications whenever a crash/restart occurs. My idea is to monitor the application for crashes and trigger a mail action from pm2-health. The documentation of the pm2-health module is here, but I'm unable to use it for sending email alerts. Can anyone explain how to use it for this purpose?
P.S.: Also, it would be great if you could explain the SMTP configuration for Gmail. (I have configured postfix to use Gmail SMTP according to this and it works fine for a test Gmail account, but it doesn't work with pm2-health.)
This is how I could get pm2-health working with my Gmail account:
Install pm2-health module:
pm2 install pm2-health
Open PM2 module config file:
vim ~/.pm2/module_conf.json
Update it with the Gmail account’s SMTP parameters:
{
"pm2-health": {
"smtp": {
"host": "smtp.gmail.com",
"port": 465,
"user": "EXAMPLE_sender#gmail.com",
"password": "PASSWORD",
"secure": true,
"disabled": false
},
"mailTo": "NOTIFICATION_RECIPIENT_EMAIL_ADDRESS",
"replyTo": "EXAMPLE_SENDER#gmail.com",
"events": [
"exit"
],
"exceptions": true,
"messages": true,
"messageExcludeExps": [],
"metric": {},
"metricIntervalS": 60,
"aliveTimeoutS": 300,
"addLogs": false,
"appsExcluded": [],
"snapshot": {
"url": "",
"token": "",
"auth": {
"user": "",
"password": ""
},
"disabled": false
}
},
"module-db-v2": {
"pm2-health": {}
}
}
Save and close it
Restart pm2-health:
pm2 restart pm2-health
Test it by restarting one of your PM2-managed Node processes. You should receive an email about that event.
For anyone trying to use with 2FA enabled Gmail, you need to use an App Password. More information here: https://support.google.com/accounts/answer/185833
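If no mail arrives, it can help to check the SMTP values outside of PM2 first. The following is a small sketch that reads the "smtp" section from a config shaped like the one above and sends a test message with Python's smtplib. The function names are mine, and it assumes port 465 with "secure": true means implicit TLS.

```python
import json
import smtplib
from email.message import EmailMessage


def load_smtp_settings(config_text):
    """Extract the pm2-health SMTP section from module_conf.json content."""
    return json.loads(config_text)["pm2-health"]["smtp"]


def send_test_mail(smtp, mail_to):
    """Send one test message using the same host, port, user and password."""
    msg = EmailMessage()
    msg["From"] = smtp["user"]
    msg["To"] = mail_to
    msg["Subject"] = "pm2-health SMTP test"
    msg.set_content("If you can read this, the SMTP settings work.")
    # Port 465 with "secure": true is assumed to mean implicit TLS, hence SMTP_SSL.
    with smtplib.SMTP_SSL(smtp["host"], smtp["port"], timeout=15) as server:
        server.login(smtp["user"], smtp["password"])
        server.send_message(msg)


# Example use (sends a real mail, so the recipient is a placeholder):
# import os
# with open(os.path.expanduser("~/.pm2/module_conf.json")) as fh:
#     send_test_mail(load_smtp_settings(fh.read()), "NOTIFICATION_RECIPIENT_EMAIL_ADDRESS")
```

A login error here points at the credentials (for example a missing App Password) rather than at pm2-health itself.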
I have an ARM template that deploys APIs to an API Management instance.
Here is an example of one API:
{
"properties": {
"authenticationSettings": {
"subscriptionKeyRequired": false
},
"subscriptionKeyParameterNames": {
"header": "Ocp-Apim-Subscription-Key",
"query": "subscription-key"
},
"apiRevision": "1",
"isCurrent": true,
"subscriptionRequired": true,
"displayName": "DDD.CRM.PostLeadRequest",
"serviceUrl": "https://test1/api/FuncCreateLead?code=XXXXXXXXXX",
"path": "CRMAPI/PostLeadRequest",
"protocols": [
"https"
]
},
"name": "[concat(variables('ApimServiceName'), '/mms-crm-postleadrequest')]",
"type": "Microsoft.ApiManagement/service/apis",
"apiVersion": "2019-01-01",
"dependsOn": []
}
When I deploy this to different environments, I would like to be able to substitute the service URL depending on the environment. What is the best approach?
Can I read in a config file or something like that?
At the time of deployment I have a variable that tells me the environment, so I can base decisions on that. I'm just not sure of the best way to do it.
Take a look at ARM template parameters: https://learn.microsoft.com/en-us/azure/azure-resource-manager/resource-group-authoring-templates#parameters They can be specified in a separate file, so you will have a single template plus environment-specific parameter files.
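As an illustration, here is a small Python sketch that writes one parameter file per environment. It assumes the template declares a string parameter, here hypothetically named serviceUrl, and sets "serviceUrl": "[parameters('serviceUrl')]" on the API resource. The environment names and URLs are invented placeholders.

```python
import json
import os

# Hypothetical environment-to-URL mapping; replace with the real endpoints.
SERVICE_URLS = {
    "dev": "https://dev.example.invalid/api/FuncCreateLead",
    "test": "https://test.example.invalid/api/FuncCreateLead",
    "prod": "https://prod.example.invalid/api/FuncCreateLead",
}


def build_parameter_file(environment):
    """Return the content of an ARM deployment parameter file for one environment."""
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
        "contentVersion": "1.0.0.0",
        "parameters": {
            # "serviceUrl" must match the parameter name declared in the template.
            "serviceUrl": {"value": SERVICE_URLS[environment]},
        },
    }


def write_parameter_files(directory="."):
    """Write parameters.<env>.json for every known environment and return the paths."""
    paths = []
    for env in SERVICE_URLS:
        path = os.path.join(directory, "parameters.{}.json".format(env))
        with open(path, "w") as fh:
            json.dump(build_parameter_file(env), fh, indent=2)
        paths.append(path)
    return paths
```

At deployment time, the variable that holds the environment name then only has to select which parameters.<env>.json file is passed along with the template.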
I'm trying to create alerts for my Cosmos DB account using an ARM template. The Cosmos DB account is already created, so I'm not able to use dependsOn to refer to the resource.
"resources": [
{
"type": "microsoft.insights/alertrules",
"name": "[parameters('alertrules_alert_name')]",
"apiVersion": "2014-04-01",
"location": "southcentralus",
"scale": null,
"properties": {
"name": "[parameters('alertrules_alert_name')]",
"description": null,
"isEnabled": true,
"condition": {
"odata.type": "Microsoft.Azure.Management.Insights.Models.ThresholdRuleCondition",
"dataSource": {
"odata.type": "Microsoft.Azure.Management.Insights.Models.RuleMetricDataSource",
"resourceUri": "[resourceId('Microsoft.DocumentDB/databaseAccounts', parameters('databaseAccounts_cosmosaccount_name_1'))]",
"metricNamespace": null,
"metricName": "Http 401"
},
"operator": "GreaterThan",
"threshold": 1,
"windowSize": "PT30M"
},
"action": null
}
}
],
"outputs": {}
}
Please review the following documentation to enable (Classic) Alerts and Diagnostic Settings via ARM template when creating a NEW Cosmos DB resource.
1) Create a classic metric alert with a Resource Manager template
2) Automatically enable Diagnostic Settings at resource creation using a Resource Manager template
3) Azure Cosmos DB diagnostic logging
Please upvote the existing entries for ARM template functionality or create a new User Voice entry that is specific to your use case: Azure Cosmos DB User Voice
I'm reading JSON data from an ARC Server report online and trying to create a database with the data.
I've created the database named: test.db
I need the columns to be identified as "Service", "Folder", "Service URL", "Configured State", "Real Time State", "Server Type".
and the rows as each "Service" returned from the report.
The JSON data looks like this:
{"reports": [{
"folderName": "/",
"serviceName": "SampleWorldCities",
"type": "MapServer",
"description": "The SampleWorldCities service is provided so you can quickly and easily preview the functionality of the GIS server. Click the thumbnail image to open in a web application. This sample service is optional and can be deleted.",
"isDefault": false,
"isPrivate": false,
"hasManifest": false,
"status": {
"configuredState": "STARTED",
"realTimeState": "STARTED"
},
"instances": {
"folderName": "/",
"serviceName": "SampleWorldCities",
"type": "MapServer",
"max": 1,
"busy": 0,
"free": 1,
"initializing": 0,
"notCreated": 0,
"transactions": 72,
"totalBusyTime": 127611,
"isStatisticsAvailable": true
},
"properties": {
"maxRecordCount": "1000",
"filePath": "${AGSSERVER}/framework/etc/data/WorldCities/WorldCities.msd",
"cacheOnDemand": "false",
"useLocalCacheDir": "true",
"outputDir": "/home/ec2-user/arcgis/server/usr/directories/arcgisoutput",
"virtualOutputDir": "/rest/directories/arcgisoutput",
"supportedImageReturnTypes": "MIME+URL",
"minScale": "295000000",
"isCached": "false",
"ignoreCache": "false",
"maxScale": "4000",
"clientCachingAllowed": "true",
"cacheDir": "/home/ec2-user/arcgis/server/usr/directories/arcgiscache"
},
"iteminfo": {
"description": "The SampleWorldCities service is provided so you can quickly and easily preview the functionality of the GIS server. Click the thumbnail image to open in a web application. This sample service is optional and can be deleted.",
"summary": "The SampleWorldCities service is provided so you can quickly and easily preview the functionality of the GIS server. Click the thumbnail image to open in a web application. This sample service is optional and can be deleted.",
"tags": [
"sample",
"map",
"service"
],
"thumbnail": "thumbnail.png"
},
"permissions": [{
"principal": "esriEveryone",
"permission": {"isAllowed": true},
"childURL": null,
"operation": null
}]
}]}
My script is as follows:
import json
import sqlite3
db = sqlite3.connect('test.db')
traffic = json_read
c = db.cursor()
someitem = traffic.itervalues().next()
columns = ['Service', 'Folder', 'Service URL', 'Configured State', 'Real Time State', 'Server Type']
c.execute("SELECT sql FROM sqlite_master WHERE " \
"Service='Services' AND type = 'table'")
create_table_string = cursor.fetchall()[0][0]
c.execute('''create table Services
(Service text primary key,
Folder text,
Service URL text,
Configured State text,
Real Time State text,
Server Type text)''')
for service, data in traffic.iteritems():
services = (service,) + tuple(data[c] for c in columns)
c = db.cursor()
c.execute(query)
c.close()
print "JSON Complete"
Can someone point me in the right direction?
Forgot to mention
Service is Service Name,
Folder is folder name,
service url is link to the service,
configured state is configuredState,
realtime state is realTimeState,
Server type is type
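import sqlite3

# Open (or create) the database file
db = sqlite3.connect('server.db')
cursor = db.cursor()

# Drop tables left over from earlier runs so the script can be re-run
cursor.execute("DROP TABLE IF EXISTS Services")
cursor.execute("DROP TABLE IF EXISTS Services2")
db.commit()

# Column names use underscores because unquoted SQL identifiers cannot contain spaces
cursor.execute('''CREATE TABLE Services
(Service text,
Folder text,
Service_URL text,
Configured_State text,
Real_Time_State text,
Server text);''')
db.commit()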
This was the code that gave me the output I was looking for.
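For completeness, here is a sketch of how rows from the report JSON could be inserted into a table with those columns. The field mapping follows the question (serviceName, folderName, status.configuredState, status.realTimeState, type). The report does not contain a service URL, so BASE_URL and the way the URL is put together are assumptions of mine and need adjusting to the actual server.

```python
import json
import sqlite3

# Hypothetical base URL; the report JSON itself does not contain a service link.
BASE_URL = "https://gisserver.example.invalid/arcgis/rest/services"


def service_rows(report_json, base_url=BASE_URL):
    """Turn the report JSON text into one tuple per service."""
    rows = []
    for report in json.loads(report_json)["reports"]:
        folder = report["folderName"]
        # Services in the root folder "/" get no folder segment in the URL.
        path = "" if folder == "/" else folder.strip("/") + "/"
        url = "{}/{}{}/{}".format(base_url, path, report["serviceName"], report["type"])
        rows.append((
            report["serviceName"],
            folder,
            url,
            report["status"]["configuredState"],
            report["status"]["realTimeState"],
            report["type"],
        ))
    return rows


def store_services(db, rows):
    """Create the Services table if needed and insert all rows."""
    db.execute("""CREATE TABLE IF NOT EXISTS Services
                  (Service text, Folder text, Service_URL text,
                   Configured_State text, Real_Time_State text, Server text)""")
    # Placeholders let sqlite3 handle quoting of the values.
    db.executemany("INSERT INTO Services VALUES (?, ?, ?, ?, ?, ?)", rows)
    db.commit()


# Example use, with the report text already loaded into report_text:
# db = sqlite3.connect("server.db")
# store_services(db, service_rows(report_text))
```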