I have IP addresses stored in a MySQL database and as I'm learning to run queries, one thought I had was to see if I could find entries that share a common network. To do this, I "broaden" my search by removing the last group of numbers from an IPv4.
So I'll do a LIKE search for an IP of XXX.XX.XXX. and see what comes up.
I have two questions:
Is what I'm doing stupid? Like, does trimming off the last group even achieve what I think I'm doing?
If not, how would I "broaden" a search for IPv6?
I've done some research on the groupings of IPv6, but I'm afraid I'm still not understanding what each grouping does. Admittedly I have only cursory knowledge of the anatomy of even IPv4.
Again, this isn't really a serious issue, but I couldn't find any information about something like this (which is what leads me to believe that I may not even be doing what I think I am doing)...
Is what I'm doing stupid? Like, does trimming off the last group even achieve what I think I'm doing?
To answer that first requires some definition of what it means to "share a common network". After all, the Internet itself is a "common network" that is "shared" by every device connected to it!
Network engineers tend to talk in terms of "domains": collision domains, broadcast domains, administrative domains, etc. At a guess, you might be thinking in terms of administrative domain.
For a short time in history, a chunk of the IP address space (known as "Class C" addresses) were assigned such that the first three octets indicated the network number, and therefore "trimming off" the final octet would (for that chunk of the IP address space) have behaved exactly as you describe.
However, since the introduction of CIDR, it no longer works like that: addresses can now be assigned in any (2-exponent) size of block and you cannot determine the size of the allocated block from the address alone. Moreover, blocks can be repeatedly subdivided and reallocated.
TL;DR—it's not clear what you think you're doing but in any event trimming off the last group probably doesn't achieve it.
Now, if you only want to determine whether a given address is on your network you can do so by performing a bitwise AND between it and your network mask; then comparing the result with the bitwise AND of your address and your network mask.
Beyond that, you'd need to lookup properties of the address in external datasets...
Related
I’d like to query an IPFS network for the uptime of its nodes individually. That is, over a duration d I would like to know roughly how much time a node has been participating in the network. Instead of time, I believe it’s also safe to frame this problem in terms of work and query for the number of blocks added within a time period.
Is there a way to do this?
crosspost
The short answer is not really. See the response on discuss.ipfs.io for more detail.
If you are fine with a trustful system, you can potentially expose block add data through the RPC. But, this isn't supported out of the box.
Otherwise, you'll have to resort to some kind of polling and crawling it seems.
sorry for the lengthy post, I have a question about common crpytographic hashing algorithms, such as the SHA family, MD5, etc.
In general, such a hash algorithm cannot be injective, since the actual digest produced has usually a fixed length (e.g., 160 bits under SHA-1, etc.) whereas the space of possible messages to be digested is virtually infinite.
However, if we generate a digest of a message which is at most as long as the digest generated, what are the properties of the commonly used hashing algorithms? Are they likely to be injective on this limited message space? Are there algorithms, which are known to produce collisions even on messages whose bit lengths is shorter than the bit length of the digest produced?
I am actually looking for an algorithm, which has this property, i.e., which, at least in principle, may generate colliding hashes even for short input messages.
Background: We have a browser plug-in, which for every web-site visited, makes a server request asking, whether the web-site belongs to one of our known partners. But of course, we don't want to spy on our users. So, in order to make it hard to generate some kind of surf history, we do not actually send the URL visited, but a hash digest (currently, SHA-1) of some cleaned up version. On the server side, we have a table of hashes of well-known URIs, which is matched against the received hash. We can live with a certain amount of uncertainty here, since we consider not being able to track our users a feature, not a bug.
For obvious reasons, this scheme is pretty fuzzy and admits false positives as well as URIs not matched which should have.
So right now, we are considering changing the fingerprint generation to something, which has more structure, for example, instead of hashing the full (cleaned up) URI, we might instead:
split the host name into components at "." and hash those individually
check the path into components at "." and hash those individually
Join the resulting hashes into a fingerprint value. Example: hashing "www.apple.com/de/shop" using this scheme (and using Adler 32 as hash) might yield "46989670.104268307.41353536/19857610/73204162".
However, as such a fingerprint has a lot of structure (in particular, when compared to a plain SHA-1 digest), we might accidentally make it pretty easy again to compute actual URI visited by a user (for example, by using a pre-computed table of hash values for "common" compont values, such as "www").
So right now, I am looking for a hash/digest algorithm, which has a high rate of collisions (Adler 32 is seriously considered) even on short messages, so that the probability of a given component hash being unique is low. We hope, that the additional structure we impose provides us with enough additional information to as to improve the matching behaviour (i.e., lower the rate of false positives/false negatives).
I do not believe hashes are guaranteed to be injective for messages the same size the digest. If they were, they would be bijective, which would be missing the point of a hash. This suggests that they are not injective for messages smaller than the digest either.
If you want to encourage collisions, i suggest you use any hash function you like, then throw away bits until it collides enough.
For example, throwing away 159 bits of a SHA-1 hash will give you a pretty high collision rate. You might not want to throw that many away.
However, what you are trying to achieve seems inherently dubious. You want to be able to tell that the URL is one of your ones, but not which one it is. That means you want your URLs to collide with each other, but not with URLs that are not yours. A hash function won't do that for you. Rather, because collisions will be random, since there are many more URLs which are not yours than which are (i assume!), any given level of collision will lead to dramatically more confusion over whether a URL is one of yours or not than over which of yours it is.
Instead, how about sending the list of URLs to the plugin at startup, and then having it just send back a single bit indicating if it's visiting a URL in the list? If you don't want to send the URLs explicitly, send hashes (without trying to maximise collisions). If you want to save space, send a Bloom filter.
Since you're willing to accept a rate of false positives (that is, random sites identified as whitelisted when in fact they are not), a Bloom filter might be just the thing.
Each client downloads a Bloom filter containing the whole whitelist. Then the client has no need to otherwise communicate with the server, and there is no risk of spying.
At 2 bytes per URL, the false positive rate would be below 0.1%, and at 4 bytes per URL below 1 in 4 million.
Downloading the whole filter (and perhaps regular updates to it) is a large investment of bandwidth up front. But supposing that it has a million URLs on it (which seems quite a lot to me, given that you can probably apply some rules to canonicalize URLs before lookup), it's a 4MB download. Compare this with a list of a million 32 bit hashes: same size, but the false positive rate would be somewhere around 1 in 4 thousand, so the Bloom filter wins for compactness.
I don't know how the plugin contacts the server, but I doubt that you can get an HTTP transaction done in much under 1kB -- perhaps less with keep-alive connections. If filter updates are less frequent than one per 4k URL visits by a given user (or a smaller number if there are less than a million URLs, or greater than 1 in 4 million false positive probability), this has a chance of using less bandwidth than the current scheme, and of course leaks much less information about the user.
It doesn't work quite as well if you require new URLs to be whitelisted immediately, although I suppose the client could still hit the server at every page request, to check whether the filter has changed and if so download an update patch.
Even if the Bloom filter is too big to download in full (perhaps for cases where the client has no persistent storage and RAM is limited), then I think you could still introduce some information-hiding by having the client compute which bits of the Bloom filter it needs to see, and asking for those from the server. With a combination of caching in the client (the higher the proportion of the filter you have cached, the fewer bits you need to ask for and hence the less you tell the server), asking for a window around the actual bit you care about (so you don't tell the server exactly which bit you need), and the client asking for spurious bits it doesn't really need (hide the information in noise), I expect you could obfuscate what URLs you're visiting. It would take some analysis to prove how much that actually works, though: a spy would be aiming to find a pattern in your requests that's correlated with browsing a particular site.
I'm under the impression that you actually want public key cryptography, where you provide the visitor with a public key used to encode the URL, and you decrypt the URL using the secret key.
There are JavaScript implementations a bit everywhere.
I will start off with saying I know that it is impossible to prevent your software from reverse engineering.
But, when I take a look at crackmes.de, there are crackmes with a difficulty grade of 8 and 9 (on a scale of 1 to 10). These crackmes are getting cracked by genius brains, who write a tutorial on how to crack it. Some times, such tutorials are 13+ pages long!
When I try to make a crackme, they crack it in 10 minutes. Followed by a "how-to-crack" tutorial with a length of 20 lines.
So the questions are:
How can I make a relatively good anti-crack protection.
Which techniques should I use?
How can I learn it?
...
Disclaimer: I work for a software-protection tools vendor (Wibu-Systems).
Stopping cracking is all we do and all we have done since 1989. So we thoroughly understand how SW gets cracked and how to avoid it. Bottom line: only with a secure hardware dongle, implemented correctly, can you guarantee against cracking.
Most strong anti-cracking relies on encryption (symmetric or public key). The encryption can be very strong, but unless the key storage/generation is equally strong it can be attacked. Lots of other methods are possible too, even with good encryption, unless you know what you are doing. A software-only solution will have to store the key in an accessible place, easily found or vulnerable to a man-in-the-middle attack. Same thing is true with keys stored on a web server. Even with good encryption and secure key storage, unless you can detect debuggers the cracker can just take a snapshot of memory and build an exe from that. So you need to never completely decrypt in memory at any one time and have some code for debugger detection. Obfuscation, dead code, etc, won't slow them down for long because they don't crack by starting at the beginning and working through your code. They are far more clever than that. Just look at some of the how-to cracking videos on the net to see how to find the security detection code and crack from there.
Brief shameless promotion: Our hardware system has NEVER been cracked. We have one major client who uses it solely for anti-reverse engineering. So we know it can be done.
Languages like Java and C# are too high-level and do not provide any effective structures against cracking. You could make it hard for script kiddies through obfuscation, but if your product is worth it it will be broken anyway.
I would turn this round slightly and think about:
(1) putting in place simple(ish) measures so that your program isn't trivial to hack, so e.g. in Java:
obfuscate your code so at least make your enemy have to go to the moderate hassle of looking through a decompilation of obfuscated code
maybe write a custom class loader to load some classes encrypted in a custom format
look at what information your classes HAVE to expose (e.g. subclass/interface information can't be obfuscated away) and think about ways round that
put some small key functionality in a DLL/format less easy to disassemble
However, the more effort you go to, the more serious hackers will see it as a "challenge". You really just want to make sure that, say, an average 1st year computer science degree student can't hack your program in a few hours.
(2) putting more subtle copyright/authorship markers (e.g. metadata in images, maybe subtly embed a popup that will appear in 1 year's time to all copies that don't connect and authenticate with your server...) that hackers might not bother to look for/disable because their hacked program "works" as it is.
(3) just give your program away in countries where you don't realistically have a chance of making a profit from it and don't worry about it too much-- if anything, it's a form of viral marketing. Remember that in many countries, what we see in the UK/US as "piracy" of our Precious Things is openly tolerated by government/law enforcement; don't base your business model around copyright enforcement that doesn't exist.
I have a pretty popular app (which i won't specify here, to avoid crackers' curiosity, of course) and suffered with cracked versions some times in the past, fact that really caused me many headaches.
After months struggling with lots of anti-cracking techniques, since 2009 i could establish a method that proved to be effective, at least in my case : my app has not been cracked since then.
My method consists in using a combination of three implementations :
1 - Lots of checks in the source code (size, CRC, date and so on : use your creativity. For instance, if my app detects tools like OllyDbg being executed, it will force the machine to shutdown)
2 - CodeVirtualizer virutalization in sensitive functions in source code
3 - EXE encryption
None of these are really effective alone : checks can be passed by a debugger, virtualization can be reversed and EXE encryption can be decrypted.
But when you used altogether, they will cause BIG pain to any cracker.
It's not perfect although : so many checks makes the app slower and the EXE encrypt can lead to false positive in some anti-virus software.
Even so there is nothing like not be cracked ;)
Good luck.
Personaly I am fan of server side check.
It can be as simple as authentication of application or user each time it runs. However that can be easly cracked. Or puting some part of code to server side and that would requere a lot more work.
However your program will requere internet connection as must have and you will have expenses for server. But that the only way to make it relatively good protected. Any stand alone application will be cracked relatively fast.
More logic you will move to server side more hard to crack it will get. But it will if it will be worth it. Even large companies like Blizzrd can't prevent theyr server side being reversed engineered.
I purpose the following:
Create in home a key named KEY1 with N bytes randomly.
Sell the user a "License number" with the Software. Take note of his/her name and surname and tell him/her that those data are required to activate the Software, also an Internet conection.
Upload within the next 24 hours to your server the "License number", and the name and surname, also the KEY3 = (KEY1 XOR hash_N_bytes(License_number, name and surname) )
The installer asks for a "Licese_number" and the name and surname, then it sends those data to the server and downloads the key named "KEY3" if those data correspond to a valid sell.
Then the installer makes KEY1 = KEY3 XOR hash_N_bytes(License_number, name and surname)
The installer checks KEY1 using a "Hash" of 16 bits. The application is encrypted with the KEY1 key. Then it decrypts the application with the key and it's ready.
Both the installer and application must have a CRC content check.
Both could check is being debugged.
Both could have encrypted parts of code during execution time.
What do you think about this method?
We have a service where we literally give away free money.
Naturally said service is ripe for abuse. To defend against this we do the following:
log ip address
use unique email addresses (only 1 acct/email addy)
collect more info like st. address, phone number, etc.
use signup captcha
BHOs (I've seen poker rooms use these)
Now, let's get real here -- NONE of this will stop a determined user.
Obviously ip addresses can be changed via a proxy (which could be blacklisted via akismet) but change anyways if the user has a dynamic ip or if more than one user is behind a NAT'd network (can we say almost everyone?)
I can sign up for thousands of unique email addresses each hour -- this is no defense.
I can put in fake information taken from lists for street addresses and phone numbers.
I can buy captchas from captcha solving services (1k for $5).
bhos seem only effective for downloadable software -- this is a website
What are some other ways to prevent multiple users from abusing the service? How do all the PPC people control click fraud?
I know we could actually call the person but I don't think we are trying to do that anytime soon.
Thanks,
It's pretty difficult to generate lots of fake phone numbers that can send and receive SMS messages. SMS verification could go a long way towards cutting down on fraud. Of course, it also limits you to giving away free money to cell phone owners.
I think only way is to bind your users accounts to 'real world' information, like his/her passport number, for instance. Of course, you'll need to make sure that information is securely stored and to find some way to validate it.
Re: signing up for new email accounts...
A user doesn't even need to do that. Please feel free to send your mail to brian_s#mailinator.com, or feydr.asks.a.question#spamherelots.com, or stackoverflow#safetymail.info, or my_arbitrary_username#zippymail.info. I haven't registered any of those email addresses, but all of them will work.
Those domains are owned by ManyBrain, and they (and probably others as well) set the domain to accept any email user. ManyBrain in particular then makes the inboxes for those emails publicly accessible without any registration (stripping everything by text from the email and deleting old mail). Check it out: admin#mailinator.com's email inbox!
Others have mentioned ways to try and keep user identities unique. This is just one more reason to not trust email addresses.
First, I suppose (hope) that you don't literally give away free money but rather give it to use your service or something like that.
That matters as there is a big difference between users trying to just get free money from you they can spend on buying expensive cars vs only spending on your service which would be much more limited.
Obviously many more user will try to fool the system in the former than in the latter case.
Why it matters? Because it is all about the balance between your control vs your user annoyance. I see many answers concentrating on the control part, so let's go through annoyance, shall we?
Log IP address. What if I am the next guy on the computer in say internet shop and the guy before me already used that IP? The other guy left your hot page that I now see but I am screwed because the IP is blocked. Yes, I can go to another computer but it is annoyance and I may have other things to do.
Collecting physical Adresses. For what??? Are you going to visit me? Or start sending me spam letters? Let me guess, more often than not you get addresses with misprints at best and fake ones at worst. In fact, it is much less hassle for me to give you fake address and not dealing with whatever possible spam letters I'll have to recycle in environment-friendly way. :)
Collecting phone numbers. Again, why shall I trust your site? This is the real story. I gave my phone nr to obscure site, then later I started receiving occasional messages full of nonsense like "hit the fly". That I simply deleted. Only later and by accident to discover that I was actually charged 2 euros to receive each of those messages!!! Do I want to get those hassles? Obviously not! So no, buddy, sorry to disappoint but I will not give your site my phone number unless your company is called Facebook or Google. :)
Use signup captcha. I love that :). So what are we trying to achieve here? Will the user who is determined to abuse your service, have problems to type in a couple of captchas? I doubt it. But what about the "good user"? Are you aware how annoying captchas are for many users??? What about users with impaired vision? But even without it, most captchas are so bad that they make you feel like you have impaired vision! The best advice I can give - if you care about user experience, avoid captchas as plague! If you have any doubts, do your online research first!
See here more discussion about control vs annoyance and here some more thoughts about being user-friendly.
You have to bind their information to something that is 'real world', as Rubens says. Of course, you also need to be able to verify this information (I can just make up passport numbers all day if you don't check to make sure they're correct).
How do you deliver the money? Perhaps you can index this off the paypal account, mailing address, or whatever you're sending the money to?
Sometimes the only way to prevent people abusing a system is to not have the system in the first place.
If you're doing what you say you're doing, "giving away money to people", then surprise surprise, there will be tons of people with more time available to try to find ways to game the system than you will have to fix it.
I guess it will never be possible to have an identification system which identifies fake identities that is:
cheap to run (I think it's called "operational cost"?)
cheap to implement (ideally one time cost - how do you call that?)
has no Type-I/Type-II errors
is scalable
But I think you could prevent users from having too many (to say a quite random number: more than 50) accounts.
You might combine the following approaches:
IP address: can be bypassed with VPN
CAPTCHA: can be bypassed with human farms (see this article, for example - although they claim that their test can't be that easily passed to other humans, I doubt this is true)
Ability-based identification: can be faked when you know what is stored and how exactly the identification works by randomly (but with a given distribution) acting (example: brainauth.com)
Real-world interaction: Although this might be the best one, but I guess it is expensive and not many users will accept it. Also, for some users/countries it might not be possible. (example: Postident in Germany, where the Post wants to see your identity card. I guess this can only be faced in massive scale by the government.)
Other sites/resources: This basically transforms the problem for other sites. You can use services, where it is not allowed/uncommon/expensive to have much more than 1 account
Email
Phone number: e.g. by using SMS, see Multi-factor authentication
Bank account: PayPal; transfer not much money or ask them to transfer a random (small) amount to you (which you will send back).
Social based
When you take the social graph (vertices are people, edges are connections), you will expect some distribution. You know that you are a single human and you know some other people. So you have a "network of trust" (in quotes, because I think this might be used in other context as well). Now you might not trust people / networks how interact heavily with your service, but are either isolated (no connection) or who connect a large group with another large group ("articulation points"). You also might not trust fast growing, heavily interacting new, isolated graphs.
When a user provides content that is liked by many other users (who you trust), this might be an indicator that there is a real human creating it.
We had a similar issue recently on our website, it is really a hassle to solve this issue if you are providing a business over one time or monthly recurring free credits system.
We are using a fraud detection solution https://fraudradar.io for a while and that helped us a lot to clean out most of the spam activities. It is pretty customizable with:
IP checks
Email domain validity
Regex rules
Whitelisting options per IP, email domain etc.
Simple API to communicate through
I would suggest to check that out.
How do I go about reverse engineering a UDP-based custom game protocol with nothing other than Wireshark? I can log a bunch of traffic, but then what? My goal is to write a dissector plugin for Wireshark that will eventually be able to decode the game commands. Does this seem feasible? What challenges might I face? Is it possible the commands are encrypted?
Yeah, it's feasible. But how practical it is will depend on the game in question. Compression will make your job harder, and encryption will make it impossible (at least through Wireshark - you can still get at the data in memory).
Probably the best way to go about this is to do it methodically - don't log 'a bunch of traffic' but instead perform a single action or command within the game and see what data is sent out to communicate that. Then you can look at the packet and try to spot anything of interest. Usually you won't learn much from that, so try another command and compare the new message with the first one. Which parts are in the same place? Which parts have moved? And which parts have changed entirely? Look especially for a value in a fixed position near the start of the packet that could be describing the message type. Generally speaking the start of the packet will be the generic stuff like the header and later parts of the packet will be the message-specifics. Consider that a UDP protocol often has its own hand-rolled ordering or reliability scheme and that you might find sequence numbers in there near the start.
Knowing your data types is handy. Integer values might be stored in big-endian or little-endian format, for example. And many games send data as floating point values, so be on the look-out for 2 or 3 floats in a row that might be describing a position or velocity.
Commercial games expect that people will try to hack the protocol as a means to cheat, so will generally use encryption and probably tamper-detection as well.
Stopping this type of activity is of great concern to game makers because it ruins the experience for the majority of players when a few players have super-tools. For games like online poker the consequences are even more severe.