How to delete the common data in two lookups in Splunk and keep only the unique data in the table - duplicates

I have 2 lookups:
lookup1: ip, host, status, type
lookup2: ip, host
Please help me with how to delete the common data in the two lookups in Splunk and keep only the unique data in the table.
Thank you.
| inputlookup Misili_OA_Daily
| search NOT [| inputlookup Misili_OA_Changes
| format]

That query looks close. How is it failing you?
The key to using a subsearch is to make sure it returns fields present in the main search - otherwise, you'll get no results.
Assuming Misili_OA_Daily is lookup1 and Misili_OA_Changes is lookup2, then this may help.
| inputlookup Misili_OA_Daily where NOT [| inputlookup Misili_OA_Changes
| fields ip host | format]

Related

MQTT sessions in database - how to make performant?

I have an application that lets devices communicate over MQTT.
When two (or more) devices are paired, they are in a session (with a session-id).
The topics are for example:
session/<session-id>/<sender-id>/phase
with a payload like
{"phase": "start", "othervars": "examplevar"}
Every session is logged to a MySQL database in the following format:
| id | session-id | sender | topic (example: phase) | payload | entry-time | ...
Now, when I just want to get a whole session I can just query by session-id.
Another view I want to achieve looks like this:
| session-id (distinct) | begin time | end time | duration | success |
Success is a boolean; true when the current session contains an entry whose payload has "phase": "success". Otherwise it is false.
Now I have the problem that this query is very slow. Every time I want to access it, it has to calculate, for each session, whether it was successful, along with the time calculation.
Should I run a script at the end of a session to calculate this information and put it in another table? The problem I have with this solution is that I will have duplicate data.
Can I make this faster with indexes? Or did I just make a huge design mistake from the beginning?
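The query in question is roughly of this shape (a reconstructed sketch, not the exact statement; the table name session_log and the underscored column names are assumptions):
SELECT session_id,
       MIN(entry_time) AS begin_time,
       MAX(entry_time) AS end_time,
       TIMEDIFF(MAX(entry_time), MIN(entry_time)) AS duration,
       -- 1 if any row in the session reports a success phase, else 0
       MAX(topic = 'phase' AND payload LIKE '%success%') AS success
FROM session_log
GROUP BY session_id;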
Thanks in advance
Indexes? Yes. YES!
If session-id is unique, get rid of id and use PRIMARY KEY(session_id).
success could be TINYINT NOT NULL with values 0 or 1 for fail or success.
If the "payload" is coming in as JSON, then I suggest storing the entire string in a column for future reference, plus pull out any columns that you need to search on and index them. In later versions of MySQL, there is a JSON datatype, which could be useful.
Please provide some SELECTs so we can further advise.
Oh, did I mention how important indexes are to databases?
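For instance, a minimal sketch of the summary-table idea (a sketch only; session_log and its columns are assumed from the question, and the LIKE test is a crude stand-in for a proper JSON extraction):
-- summary table keyed by session, so reads become a point lookup
CREATE TABLE session_summary (
  session_id VARCHAR(64) NOT NULL,
  begin_time DATETIME    NOT NULL,
  end_time   DATETIME    NOT NULL,
  success    TINYINT     NOT NULL DEFAULT 0,  -- 0 = fail, 1 = success
  PRIMARY KEY (session_id)
);

-- an index on the log table keeps the per-session aggregation cheap
CREATE INDEX idx_session_entry ON session_log (session_id, entry_time);

-- refresh one session's row when the session ends
INSERT INTO session_summary (session_id, begin_time, end_time, success)
SELECT session_id, MIN(entry_time), MAX(entry_time),
       MAX(topic = 'phase' AND payload LIKE '%success%')
FROM session_log
WHERE session_id = ?
GROUP BY session_id
ON DUPLICATE KEY UPDATE
  begin_time = VALUES(begin_time),
  end_time   = VALUES(end_time),
  success    = VALUES(success);
This way the expensive aggregation runs once per session instead of on every read.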

Should I store users' IP addresses in a separate table?

I am building a new website that will store IP addresses in multiple tables like users, login_history, payments and more.
I am wondering if I should add an ip column in each table and store the actual IP, or if I should create a separate table named ip_addresses and store the IP identifier in the columns.
Method 1:
users:
username | ip
jon_snow | 134744072
Method 2:
users:
username | ip
jon_snow | 1
--
ip_addresses:
id | ip
1 | 134744072
Since IP addresses will change for most users over time, the best place to store them is probably the login_history table. This way you can associate the IP addresses with the users and their sessions.
Of course, if you want to restrict user access based on IP address and require your users to use the same IP over time, then store it in the users table.
IPv4 addresses are meaningfully formatted 32-bit integers. IPv6 ones are meaningfully formatted again, but much larger. Either way, you'd be creating a 1:1 mapping of dense data. Unless you need to do it for other reasons, I would not normally choose to normalise them into another table. You're unlikely to gain speed or save space, unless your users have a very restricted set of IPs.
Using inet(6)_aton will pack string representations, and the _ntoa version will unpack them efficiently, so you can work with meaningful strings and store efficient binary versions.
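For example, in MySQL 5.6+ (a sketch; the table and column names are illustrative only):
CREATE TABLE login_history (
  id       INT UNSIGNED  NOT NULL AUTO_INCREMENT,
  username VARCHAR(64)   NOT NULL,
  ip       VARBINARY(16) NOT NULL,  -- 4 bytes for packed IPv4, 16 for IPv6
  login_at DATETIME      NOT NULL,
  PRIMARY KEY (id),
  KEY idx_ip (ip)
);

-- pack on insert, unpack on read
INSERT INTO login_history (username, ip, login_at)
VALUES ('jon_snow', INET6_ATON('8.8.8.8'), NOW());

SELECT username, INET6_NTOA(ip) AS ip, login_at
FROM login_history;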

How do I tell what is in a Solr index?

I'm trying to index a very small (6-row) table into Solr, and it says that it added/updated 6 documents, but it doesn't return anything when I search for a field. My table is as follows:
League:
field | type |
---------------------
id | int |
leaguename | string|
and here is what Solr prints when I try to do the full-import:
command: full-import, status: idle, config: data-config.xml
Full Dump Started: 2011-07-13 19:11:42
Indexing completed. Added/Updated: 6 documents. Deleted 0 documents.
Committed: 2011-07-13 19:11:42, Time taken: 0:0:0.120
This response format is experimental. It is likely to change in the future.
Is there a way that I can view the values that index is holding? I've tried looking in the data folder in solr, but all the files just seem to have strange non-alphanumeric characters in them.
Luke is a desktop app which allows you to examine an index, run queries and generally muck around.
If the index is remote you will first need to transfer it to your desktop, then just open it in Luke.
http://code.google.com/p/luke/
Luke rocks!
Assuming you are running Solr on localhost and port 8983, as per the standard example,
you can do a wildcard query like
http://localhost:8983/solr/select?q=*:*
This will return all the documents with all stored fields.
From the admin screen, query a wildcard on whatever your unique key field is:
uniquekeyfieldname:*
That will get you a count to see if something got indexed. If you want to see all the fields too, then specify the field list at the end of the query string:
&fl=*
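Putting those together, and assuming your unique key field is simply named id (an assumption), the full URL would look something like:
http://localhost:8983/solr/select?q=id:*&fl=*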

Storing array with values in database

I have the following data which I want to save in my DB (this is used for sending text messages via a 3rd party API)
text_id, text_message, text_time, (array)text_contacts
text_contacts contains a normal array with all the contact_id's
How should I properly store the data in a MySQL database?
I was thinking myself either on 2 ways:
Make the array with contact_id's in a json_encoded (no need for serializing since it's not multi-dimensional) string, and store it in a text field in the DB
Make a second table with the text_id and each contact_id on a new row.
note: The data stored in the text_contacts array does not need to be changed at any time.
note 2: The data is used as individual contact_ids to get the phone number of each contact, and to check whether the text message has actually been sent (with a combination of text_id and phone number).
What is more efficient, and why?
This is completely dependent upon your expected usage characteristics. If you will have a near-term need to query based upon the contact_ids, then store them independently as in your second solution. If you're storing them for archival purposes, and don't expect them to be used dynamically, you're as well off saving the time and storing them in a JSON string. It's all about the usage.
IMO, go with the second table, mapping text-ids to contact-ids. Will be easier to manipulate than storing all the contacts in one field
This topic will bring in quite a few opinions, but my belief: second table, by all means.
If you ever have a case where you actually need to search by that data, it will not require you to parse it before using it.
It is a heck of a lot easier to debug (for the same reason)
json_encode and json_decode (or equivalent) take far more time than a join does.
Lazy loading is easier, even if not necessary in most cases.
Others will find it more readable and, with a good schema definition, easier to conceptualize and maintain.
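A minimal sketch of that second table in MySQL (a sketch only; table and column names are assumptions based on the question):
CREATE TABLE text_messages (
  text_id      INT UNSIGNED NOT NULL AUTO_INCREMENT,
  text_message TEXT         NOT NULL,
  text_time    DATETIME     NOT NULL,
  PRIMARY KEY (text_id)
);

CREATE TABLE text_message_contacts (
  text_id    INT UNSIGNED NOT NULL,  -- references text_messages.text_id
  contact_id INT UNSIGNED NOT NULL,
  PRIMARY KEY (text_id, contact_id),
  KEY idx_contact (contact_id)
);

-- all contacts a given message was sent to
SELECT contact_id FROM text_message_contacts WHERE text_id = 1;

-- all messages sent to a given contact
SELECT m.*
FROM text_messages m
JOIN text_message_contacts c ON c.text_id = m.text_id
WHERE c.contact_id = 1;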
Almost all implementations would use one table for storing the contacts, and then the text message table would use a foreign key to reference the text_contacts table. So, say you had a table text_contacts that looked like this:
contact_id | name
1 | someone
2 | someone_else
And a text message table that looked like this:
text_id | text_message | text_time | text_contact
1 | "Hey" | 12:48 | 1
2 | "Hey" | 12:48 | 2
Each contact that has been sent a message would have a new entry in the text message table, with the last column referencing the contact_id field of the text_contacts table. This way makes it much easier to retrieve messages by contact, because you can say "select * from text_messages where text_contact = 1" instead of searching through each of the arrays on the single table to find the messages sent by a specific user.

System for tracking changes in whois records

What's the best storage mechanism (in terms of the database to use and the way to store all the records) for a system built to track whois record changes? The program will run once a day, and a record should be kept of what the previous value was and what the new value is.
Any suggestions on the database to use, and thoughts on how to store the different records/fields so that data is not redundant/duplicated?
(Added) My thoughts on one mechanism to store data
Example case showing sale of one domain "sample.com" from personA to personB on 1/1/2010
Table_DomainNames
DomainId | DomainName
1 | example.com
2 | sample.com
Table_ChangeTrack
DomainId | DateTime | RegistrarId | RegistrantId | (others)
2 | 1/1/2009 | 1 | 1
2 | 1/1/2010 | 2 | 2
Table_Registrars
RegistrarId | RegistrarName
1 | GoDaddy
2 | 1&1
Table_Registrants
RegistrantId | RegistrantName
1 | PersonA
2 | PersonB
All tables are "append-only". Does this model make sense? Table_ChangeTrack should be "added to" only when there is any change in ANY of the monitored fields.
Is there any way of making this more efficient / tighter from a size point of view?
The primary data is the existence of, or changes to, the whois records. This suggests that your primary table be:
<id, domain, effective_date, detail_id>
where the detail_id points to actual whois data, likely normalized itself:
<detail_id, registrar_id, admin_id, tech_id, ...>
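In MySQL that could look roughly like this (a sketch; table and column names and types are assumptions):
CREATE TABLE whois_history (
  id             INT UNSIGNED NOT NULL AUTO_INCREMENT,
  domain         VARCHAR(255) NOT NULL,
  effective_date DATE         NOT NULL,
  detail_id      INT UNSIGNED NOT NULL,  -- references whois_detail.detail_id
  PRIMARY KEY (id),
  KEY idx_domain_date (domain, effective_date)
);

CREATE TABLE whois_detail (
  detail_id    INT UNSIGNED NOT NULL AUTO_INCREMENT,
  registrar_id INT UNSIGNED NOT NULL,
  admin_id     INT UNSIGNED NOT NULL,
  tech_id      INT UNSIGNED NOT NULL,
  -- ...further normalized whois fields
  PRIMARY KEY (detail_id)
);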
But do note that most registrars consider the information their property (whether it is or not) and have warnings like:
TERMS OF USE: You are not authorized to access or query our Whois database through the use of electronic processes that are high-volume and automated except as reasonably necessary to register domain names or modify existing registrations...
From which you can expect that they'll cut you off if you read their databases too much.
You could
store the checksum of a normalized form of the whois record data fields for comparison (see the sketch after this list).
store the original and current version of the data (possibly in compressed form), if required.
store diffs of each detected change (possibly in compressed form), if required.
It is much like how incremental backup systems work. Maybe you can get further inspiration from there.
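As a sketch of the checksum idea (hypothetical; it assumes a table like the whois_history sketched above), normalize the record to a canonical string, hash it, and insert a new row only when the hash changes:
ALTER TABLE whois_history ADD COLUMN checksum CHAR(32) NOT NULL DEFAULT '';

-- before the daily run, fetch the most recent checksum for the domain
SELECT checksum
FROM whois_history
WHERE domain = 'sample.com'
ORDER BY effective_date DESC
LIMIT 1;

-- the application computes e.g. MD5 over the normalized record text
-- and inserts a new row only when the value differs from the stored one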
You can write VBScript in an Excel file to go out and query a web page (in this case, the particular whois URL for a specific site) and then store the results back to a worksheet in Excel.