Mifare PlusX 4K: is there a consistent but unique part of the key that can be used for identification without authentication?

I am trying to write a PC application (Windows, .NET) that identifies students on the basis of some card equipped with RFID identification to build lecture attendance registers. Currently I have a Stronglink SL040A RFID reader (http://www.stronglink-rfid.com/en/rfid-modules/sl040.html), which operates as a HID and sends the data as a series of keystrokes.
The system works perfectly with older cards like the Mifare Classic 1K (even with PayPass credit cards). The new student cards (and identity cards) issued by the Hungarian authorities, however, contain Mifare PlusX 4K chips, which seem to send a new key every time the card is used. I have tried experimenting with the settings offered by the reader's configuration tool, but to no avail. I can make the Classic 1K cards send a much longer key by changing the end-block parameter, but the PlusX 4K keeps sending the shorter, and painfully inconsistent, keys.
I am a physicist without a deeper understanding of these chips or of RFID authentication in general – I am just trying to get a job done that seemed easy at the beginning. I have no intention of cracking or abusing these cards in any way; I am simply trying to find some block of data on the card that stays consistent on each use, does not require a complicated authentication protocol, but is unique between different cards.
Is it possible, or is it against the philosophy of these chips? If it is possible, do I have to buy a new reader, or can I make this one do what I need?
Your thoughts are much appreciated.

From the MiFare PlusX 4K datasheet:
Section 8.2:
There are three different versions of the PICC. The UID is programmed into a locked part
of the NV-memory reserved for the manufacturer:
• unique 7-byte serial number
• unique 4-byte serial number
• non-unique 4-byte serial number
Due to security and system requirements, these bytes are write-protected after being
programmed by the PICC manufacturer at production.
...
During personalization, the PICC can be configured to support Random ID in security
level 3. The user can configure whether Random ID or fixed UID shall be used. According
to ISO/IEC 14443-3 the first anticollision loop (see Ref. 5) returns the Random Number
Tag 08h, the 3-byte Random Number and the BCC, if Random ID is used. The retrieval of
the UID in this case can be done using the Virtual Card Support Last command, see
Ref. 3 or by reading out block 0.
From what you have described, it appears that the cards are running in Security Level 3, and unfortunately the backwards-compatible part of the card only exists at the lower security levels. The Virtual Card Support Last command mentioned above is likewise only available after a Level 3 authentication.
I'm afraid what you want to do appears impossible unless you can use the ISO/IEC 14443-4 protocol layer, which should let you authenticate at Level 3. The relevant details appear to be in section 8.7 and involve AES authentication.
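As a quick sanity check on the reader side, the datasheet excerpt above implies that a Random ID announces itself: the anticollision loop returns the tag byte 08h followed by a 3-byte random number. A minimal sketch of that check (the function name is mine, not from any SDK):

```python
def is_random_id(uid: bytes) -> bool:
    """Per the datasheet excerpt above: a Random ID comes back as a
    4-byte value whose first byte is the tag 08h, so it changes on
    every activation; a fixed 4- or 7-byte UID does not start that way."""
    return len(uid) == 4 and uid[0] == 0x08
```

This at least lets an application detect that it is receiving a random, non-persistent identifier rather than silently registering it as a student ID.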

Related

What are examples of real-world scenarios where a message queuing system can accept the loss of some messages?

I was reading this blog post, in which the author proposes the following question, in the context of message queues:
does it matter if a message is lost? If your application node, processing the request, dies, can you recover? You’ll be surprised how often it doesn’t actually matter, and you can function properly without guaranteeing all messages are processed
At first I thought that the main point of handling messages was to never lose a single message – after all, a lost message could mean a hotel reservation not booked, a checkout not completed, or some other functionality not carried through, which seems awfully similar to a bug to me. I suppose I am missing something, so: what are examples of scenarios where it is OK for a messaging system to lose a few messages?
Well, your initial expectation:
the main point of handling messages was to never lose a single message
was simply not correct.
True, if one strives for a certain type of robustness, where fail-safe measures must take every care and precaution so that not a single message can be lost, then your a priori expectation fits.
That does not mean every other system design has to carry the immense burdens and pay the incurred costs (resource-wise, latency-wise, et al.) that such "100+% guaranteed delivery" systems do (and even those deliver only when they can).
Anti-pattern cases:
There are many use cases where absolute certainty of delivery of each and every message sent is actually an anti-pattern.
Just imagine a weakly synchronised system (including ones with no back-throttling, or even the simplest form of feedback propagation), where sensors read a temperature, a sound, or a video frame and send a message with that value.
Whenever a post-processing system receives such information, there may be a reason not to read all the "old" values, but only the most recent one(s).
If the delivery framework has already received a newer set of values, the "older" values still sitting in the queue, not yet processed, are precisely what you would rather not read and process at all.
Just as no one will trade with you at yesterday's prices, there is no positive value in making a new, current decision based on "old" temperature readings that still wait in the queue.
Some smart-messaging frameworks provide explicit means for taking just the newest message from a given source, imperatively discarding any "older" messages rather than reading and processing them, given the known presence of a more recent one.
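As an illustration, this keep-only-the-newest behaviour can be sketched as a tiny conflating queue (a generic sketch, not any particular framework's API):

```python
from collections import OrderedDict

class ConflatingQueue:
    """Keeps only the most recent message per source key; older,
    not-yet-read messages from the same source are silently replaced."""

    def __init__(self):
        self._latest = OrderedDict()

    def put(self, source, message):
        # Dropping the old entry first discards the stale value and
        # moves this source to the back of the arrival order.
        self._latest.pop(source, None)
        self._latest[source] = message

    def get(self):
        # Pops the source whose latest update arrived earliest.
        return self._latest.popitem(last=False)
```

A reader always sees at most one pending value per sensor, however slowly it drains the queue.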
This answers the original question about the assumed main point of handling messages.
Efficiency first:
In any case where smart delivery takes place (deliver an exact copy of the original message content, or nothing at all), resources are used at their best effort, without spending a single penny on anything beyond "just-enough" smart delivery.
Building robustness costs more than that.
Building ultimate robustness costs far more still.
Systems that do have such an extreme requirement can extend the resource-efficient smart delivery to reach whatever level of robustness the requirements define, at an add-on cost.
The reverse is not possible: an "everything-proof" system cannot be slimmed down to fit restricted-resources hardware, nor made to "forget" "old" messages that have no positive value at this moment; on the contrary, it obliges the processing element to read and process each and every "unwanted" message simply because it was delivered, even though the core logic needs only the most recent one.
Distributed systems accrue end-to-end latency from many distributed sources, so any rigid-delivery system simply blocks and penalises the one element that is (latency-wise) innocent: the receiver.
I suppose it's OK to lose a few messages from measurement units that deliver a value once in.... Also, for big-data analytics solutions, a few lost messages won't make a big difference.
It all depends on the application/larger system. The message queue is only one link in the chain, so to speak. If the application(s) at the ends are prepared to deal with loss, losing some messages is not a problem. If the application(s) rely on total messaging integrity then there will be problems.
An example of a system that will be ok with loss is weather updates for your phone. If a few temperature/wind updates don't make it to you there's no real harm in that.
Now, if you're running a nuclear reactor and you lose a few temperature updates on the core, well that is a problem.
I work a lot on safety critical, infrastructure-level systems, and am responsible for messaging much of the time. Many of those systems state clearly that messaging may reorder, duplicate, or lose messages; it's just a fact of life where distributed systems and networks are involved. The endpoint systems need to be designed to work correctly in that environment. So they track messages, ack end to end, deal with duplicates and retransmits, etc.

Adobe Air unique id issue

I created an AIR app which sends an ID to my server to verify the user's licence.
I created it using
NetworkInfo.networkInfo.findInterfaces(), and I use the first "name" value whose "displayName" contains "LAN" (or the first MAC address I get if the user is on a Mac).
But I have a problem:
sometimes users connect to the internet using a USB stick (supplied by a mobile phone company), and that changes the serial number I get; the USB stick probably becomes the first value in the vector returned by findInterfaces().
I could take the last value instead, but I think I could run into similar problems.
So is there a better way to identify the computer even with these small hardware changes?
It would be nice to get the motherboard or CPU serial, but that seems not to be possible. I've found some workarounds to get it, but they work on Windows and not on a Mac.
I don't want to store data on the user's computer for authentication, to make the software "a little" harder to hack.
Any idea?
Thanks
Nadia
So is there a better way to identify the computer even with these small hardware changes?
No, there are no best practices for identifying a personal computer and building software licensing for users on top of that. You should use a server-side licensing manager to provide that functionality. It will also give your users flexibility with your desktop software. It's much easier both for the product owner (you don't need a call centre responding to every call about a changed network card, hard drive, or whatever) and for the users of such a product.
Briefly speaking, a user's personal computer is an insecure environment (frankly speaking, you have no way to store anything valuable there) and a very dynamic one (the hardware replacement cycle is far too short to build a licensing program on).
I am in much the same boat as you, and I am now finally starting to address this. I have researched this for over a year, and there are a couple of options out there.
The biggest thing to watch out for when using a third-party system is the leech effect. Nearly all of them want a percentage of your profit, which in my mind makes them nothing more than vampireware. This is on top of the percentage you WILL pay to PayPal, your merchant processor, etc.
The route I will end up taking is creating a secondary ANE, probably written in Java, because of 1) transitioning my knowledge and 2) the ability to run on various architectures. I concede this solution is not foolproof, since reverse engineering Java is nearly as easy as anything running on Flash Player. The point is just to make it harder, not bulletproof.
As a side note to any naysayers about changing the CPU/motherboard: this is extremely rare, if it is even still done at all. I work on a laptop, and obviously once that hardware cycle is over I need to re-register everything on a new one. So please...
Zarqon was developed by: Cliff Hall
This appears to be a good solution at small scale. The reason I do not believe it scales well (say, beyond a few thousand users) is that, based on the documentation, it appears to be a completely manual process, i.e. there is no ability to tie into a payment system to auto-generate the key and notify the user (I could be wrong about this).
Other helpful resources:
http://www.adobe.com/devnet/flex/articles/flex_paypal.html

Are cryptographic hashes injective under certain conditions?

Sorry for the lengthy post; I have a question about common cryptographic hashing algorithms, such as the SHA family, MD5, etc.
In general, such a hash algorithm cannot be injective, since the actual digest produced has usually a fixed length (e.g., 160 bits under SHA-1, etc.) whereas the space of possible messages to be digested is virtually infinite.
However, if we generate a digest of a message which is at most as long as the digest itself, what are the properties of the commonly used hashing algorithms? Are they likely to be injective on this limited message space? Are there algorithms which are known to produce collisions even on messages whose bit length is shorter than that of the digest produced?
I am actually looking for an algorithm, which has this property, i.e., which, at least in principle, may generate colliding hashes even for short input messages.
Background: We have a browser plug-in, which for every web-site visited, makes a server request asking, whether the web-site belongs to one of our known partners. But of course, we don't want to spy on our users. So, in order to make it hard to generate some kind of surf history, we do not actually send the URL visited, but a hash digest (currently, SHA-1) of some cleaned up version. On the server side, we have a table of hashes of well-known URIs, which is matched against the received hash. We can live with a certain amount of uncertainty here, since we consider not being able to track our users a feature, not a bug.
For obvious reasons, this scheme is pretty fuzzy and admits false positives as well as URIs not matched which should have.
So right now, we are considering changing the fingerprint generation to something, which has more structure, for example, instead of hashing the full (cleaned up) URI, we might instead:
split the host name into components at "." and hash those individually
split the path into components at "/" and hash those individually
Join the resulting hashes into a fingerprint value. Example: hashing "www.apple.com/de/shop" using this scheme (and using Adler 32 as hash) might yield "46989670.104268307.41353536/19857610/73204162".
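A sketch of that scheme using Python's zlib.adler32 (the URL clean-up rules are omitted, and the exact splitting and joining are my reading of the description; with this reading, the host components reproduce the figures in the example above):

```python
import zlib

def fingerprint(host: str, path: str) -> str:
    """Hash each host component (split on '.') and each path component
    (split on '/') separately with Adler-32, then join the per-component
    hashes back together into a structured fingerprint string."""
    host_part = ".".join(str(zlib.adler32(c.encode())) for c in host.split("."))
    path_part = "/".join(str(zlib.adler32(c.encode())) for c in path.strip("/").split("/"))
    return host_part + "/" + path_part
```

For example, fingerprint("www.apple.com", "/de/shop") starts with "46989670.104268307.41353536/19857610/", matching the leading components of the example value quoted in the question.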
However, as such a fingerprint has a lot of structure (in particular when compared to a plain SHA-1 digest), we might accidentally make it pretty easy again to compute the actual URI visited by a user (for example, by using a precomputed table of hash values for "common" component values, such as "www").
So right now I am looking for a hash/digest algorithm which has a high rate of collisions (Adler-32 is seriously being considered) even on short messages, so that the probability of a given component hash being unique is low. We hope that the additional structure we impose provides enough extra information to improve the matching behaviour (i.e., lower the rate of false positives/false negatives).
I do not believe hashes are guaranteed to be injective even for messages the same size as the digest. If they were, they would be bijective (an injective map between finite sets of equal size is a bijection), which would be missing the point of a hash. This suggests that they are not injective for messages smaller than the digest either.
If you want to encourage collisions, I suggest you use any hash function you like and then throw away bits until it collides enough.
For example, throwing away 159 bits of a SHA-1 hash will give you a pretty high collision rate. You might not want to throw that many away.
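That bit-throwing can be sketched in a couple of lines, assuming nothing beyond the standard library:

```python
import hashlib

def truncated_hash(data: bytes, bits: int) -> int:
    """Compute a full SHA-1 digest, then keep only the low `bits` bits.
    The fewer bits kept, the higher the collision rate: with n bits
    kept, roughly 2**(n/2) random inputs suffice for a collision."""
    digest = int.from_bytes(hashlib.sha1(data).digest(), "big")
    return digest & ((1 << bits) - 1)
```

Truncating to 8 bits, for instance, guarantees a collision after at most 257 distinct inputs by the pigeonhole principle.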
However, what you are trying to achieve seems inherently dubious. You want to be able to tell that a URL is one of yours, but not which one it is. That means you want your URLs to collide with each other, but not with URLs that are not yours. A hash function won't do that for you. Rather, because collisions will be random, and since there are many more URLs which are not yours than which are (I assume!), any given level of collision will lead to dramatically more confusion over whether a URL is one of yours at all than over which of yours it is.
Instead, how about sending the list of URLs to the plugin at startup, and then having it just send back a single bit indicating if it's visiting a URL in the list? If you don't want to send the URLs explicitly, send hashes (without trying to maximise collisions). If you want to save space, send a Bloom filter.
Since you're willing to accept a rate of false positives (that is, random sites identified as whitelisted when in fact they are not), a Bloom filter might be just the thing.
Each client downloads a Bloom filter containing the whole whitelist. Then the client has no need to otherwise communicate with the server, and there is no risk of spying.
At 2 bytes per URL, the false positive rate would be below 0.1%, and at 4 bytes per URL below 1 in 4 million.
Downloading the whole filter (and perhaps regular updates to it) is a large investment of bandwidth up front. But supposing that it has a million URLs on it (which seems quite a lot to me, given that you can probably apply some rules to canonicalize URLs before lookup), it's a 4MB download. Compare this with a list of a million 32 bit hashes: same size, but the false positive rate would be somewhere around 1 in 4 thousand, so the Bloom filter wins for compactness.
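The sizing claims above can be checked with the standard optimal-Bloom-filter formula: with m/n bits of filter per stored element and the optimal number of hash functions k = (m/n)·ln 2, the false-positive rate is (1/2)^k. A one-liner to evaluate it:

```python
import math

def optimal_fpr(bits_per_element: float) -> float:
    """False-positive rate of a Bloom filter that uses the optimal
    number of hash functions k = (m/n) * ln 2, given m/n bits per
    stored element: p = (1/2) ** k."""
    return 0.5 ** (bits_per_element * math.log(2))
```

optimal_fpr(16) is about 4.6e-4, matching the "below 0.1% at 2 bytes per URL" figure, and optimal_fpr(32) is about 2.1e-7, i.e. better than 1 in 4 million at 4 bytes per URL.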
I don't know how the plugin contacts the server, but I doubt you can get an HTTP transaction done in much under 1 kB – perhaps less with keep-alive connections. If filter updates are less frequent than one per 4,000 URL visits by a given user (or a smaller number if there are fewer than a million URLs, or a greater-than-1-in-4-million false-positive probability), this has a chance of using less bandwidth than the current scheme, and of course it leaks much less information about the user.
It doesn't work quite as well if you require new URLs to be whitelisted immediately, although I suppose the client could still hit the server at every page request, to check whether the filter has changed and if so download an update patch.
Even if the Bloom filter is too big to download in full (perhaps for cases where the client has no persistent storage and RAM is limited), then I think you could still introduce some information-hiding by having the client compute which bits of the Bloom filter it needs to see, and asking for those from the server. With a combination of caching in the client (the higher the proportion of the filter you have cached, the fewer bits you need to ask for and hence the less you tell the server), asking for a window around the actual bit you care about (so you don't tell the server exactly which bit you need), and the client asking for spurious bits it doesn't really need (hide the information in noise), I expect you could obfuscate what URLs you're visiting. It would take some analysis to prove how much that actually works, though: a spy would be aiming to find a pattern in your requests that's correlated with browsing a particular site.
I'm under the impression that you actually want public key cryptography, where you provide the visitor with a public key used to encode the URL, and you decrypt the URL using the secret key.
There are JavaScript implementations all over the web.

Recommendations for a programmable drivers license scanner? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 3 years ago.
Our motor pool wants to scan drivers’ licenses and have the data imported into our custom system. We're looking for something that will allow us to programmatically get the data from the scanner (including the picture) and let us insert it into our application. I was wondering if anyone has had experience with this type of system and could recommend one or tell us which ones to avoid. Our application is written in PowerBuilder and uses a DB2 database.
Try solutions by idScan.net (www.idScan.net)
There is an SDK that will parse driver's licenses for all states in the USA and the Canadian provinces. You can also purchase hardware such as the E-Seek M250 ID scanner, which reads both 2D barcodes and magnetic stripes (software included).
Good luck!
We support something similar in our records management software. Our application is designed to work with a wedge reader, since they are the easiest to get up and running (no special drivers needed). When a card is swiped, the reader sends keystrokes to the OS for each character that is encoded on the magnetic stripe, with a simulated Enter keypress between each track (an AAMVA-compliant license has 3 data tracks).
It's slightly annoying because it behaves exactly as if someone were typing out the data by hand, so there is no easy way to tell when you have all the data. (You could just wait for 3 lines of information, but then it's difficult to detect invalid cards, such as when someone swipes a student ID card that has fewer than 3 tracks encoded; in that case, the application hangs forever waiting for a non-existent third track.) To deal with this, we use a "fail-fast" approach: each time we get an Enter keypress, we immediately process the current line, keeping a record of which track we are expecting at that point (1, 2, or 3). If the current track cannot be processed (for example, a different start character appears on the track than what is documented for the AAMVA driver's license format), we assume the user must have swiped something other than a driver's license.
I'm not sure if the reader we use supports reading image data or not. It can be programmed to return a subset of the data on the card, but we just use the factory default setting, which appears to return only the first three data tracks (and actually I believe image data is encoded in the 2D barcode found on some licenses, not on the magnetic stripe, but I could be wrong).
For more on the AAMVA track format that is used on driver's license magstripes, see Annex F in the current standard.
The basic approach we use is:
Display a modal dialog that has a hidden textbox, which is given focus. The dialog box simply tells the user to swipe the card through the reader.
The user swipes the card, and the reader starts sending keydown events to the hidden textbox.
The keydown event handler for the textbox watches for Enter keypresses. When one is detected, we grab the last line currently stored in the textbox, and pass it to a track parser that attempts to parse the track according to the AAMVA format.
If this "fail-fast" parsing step fails for the current track, we change the dialog's status message to a message telling the user the card could not be read. At this point, the textbox will still receive additional keydown events, but it's OK because subsequent tracks have a high enough chance of also failing that the user will still see the error message whenever the reader stops sending data.
If the parsing is successful, we increment a counter that tells the parser what track it should process next.
If the current track count is greater than 3, we know we've processed 3 tracks. At this point we parse the 3 tracks (which have already split most of the fields up but everything is still stored as strings at this point) into a more usable DriversLicense object, which does additional checks on the track data, and makes it more consumable from our application (converting the DOB field from a string into a real Date object, parsing out the subfields in the AAMVA Name field into first name, middle name, last name, name suffix, etc.). If this second parsing phase fails, we tell the user to reswipe the card. If it succeeds, we close the dialog and pass the DriversLicense object to our main application for further processing.
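The flow above can be sketched as a small state holder (the class name and the track-parser signature are mine for illustration, not the actual product code):

```python
class SwipeSession:
    """Fail-fast wedge-reader handling as described above: each track
    line is parsed as it arrives instead of waiting for all three."""
    EXPECTED_TRACKS = 3

    def __init__(self, parse_track):
        # parse_track(track_number, line) returns parsed fields,
        # or raises ValueError if the line is not a valid track.
        self.parse_track = parse_track
        self.tracks = []
        self.failed = False

    def on_line(self, line):
        """Called by the Enter-keypress handler with the latest line."""
        if self.failed or self.done():
            return  # keep showing the error / result; ignore extra input
        try:
            self.tracks.append(self.parse_track(len(self.tracks) + 1, line))
        except ValueError:
            self.failed = True  # show "card could not be read"

    def done(self):
        return len(self.tracks) == self.EXPECTED_TRACKS
```

Once done() is true, the three raw track records would be handed to the second parsing phase that builds the DriversLicense object.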
If your scanner is TWAIN-compliant, you will be able to manage it from your app through an ActiveX control you can buy on the net, like this one. You'll be able to manage the basic scan parameters (quality, color, single/multiple-page scan, output format, etc.), start the scan from your app, save the result as a file, and transfer that file wherever needed. We have been using one with VB code for the last 2 years. It works.
Maybe you want to use a magnetic stripe reader to get the driver's license info from the card. As I remember, most driver's licenses just have the data in plain text on those stripes, so it is relatively straightforward programming-wise.
MagStripe readers are also cheap nowadays.
You can try something from this list: http://www.adams1.com/plugins.html
I have not used them myself, though.
I wrote a parser in C#, and while it's "OK", it's still far from perfect.
I can't seem to find it now, but a Wikipedia entry used to exist that listed the patterns to look for (trust me, parsing this yourself is a pain without any help).
Be aware that different states have different laws about what you can and can't use government-issued IDs for. Texas has one.
We use a Dell card reader, and it inputs the data exactly as though it were typed on a keyboard, followed by the Enter key. This made programming /very/ easy, because you just send focus to the text box and wait for Enter. The main key that breaks it into chunks is the caret '^'. Split on that and you'll have your basic chunks.
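For illustration, splitting a track string on the caret might look like this (a sketch assuming the common %...^...^...? track 1 layout with '%' as the start sentinel and '?' as the end sentinel; real cards need the full AAMVA spec):

```python
def split_track1(track: str) -> list[str]:
    """Strip the '%' start and '?' end sentinels, then split the
    remaining track 1 data into its caret-separated fields."""
    return track.lstrip("%").rstrip("?").split("^")
```

Each resulting chunk (region/city, name, address, ...) can then be parsed individually.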
You can also use the InfoScan SDK. You can find it at www.scan-monitor.com; the system allows you to use any scanner and does not force you to purchase a specific one.

How would you go about reverse engineering a set of binary data pulled from a device?

A friend of mine brought up this question the other day. He recently bought a Garmin heart rate monitor which keeps track of his heart rate and allows him to upload a day's heart-rate stats to his computer.
The only problem is that there are no Linux drivers for the Garmin USB device. He has managed to interpret some of the data, such as the model number and his user details, and has identified some binary data tables which we assume represent a series of recordings of his heart rate and the time each recording was taken.
Where does one start when reverse engineering data when you know nothing about the structure?
I had the same problem and initially found this project on Google Code that aims to build a cross-platform version of the tools for Garmin devices; see: http://code.google.com/p/garmintools/. There's a link on the front page of that project to the protocols you need, which Garmin was thoughtful enough to release publicly.
And here's a direct link to the Garmin I/O specification: http://www.garmin.com/support/pdf/IOSDK.zip
I'd start looking at the data in a hexadecimal editor, hopefully a good one which knows the most common encodings (ASCII, Unicode, etc.) and then try to make sense of it out of the data you know it has stored.
As another poster mentioned, reverse engineering can be hairy, not in practice but in legality.
That being said, you may be able to find everything related to your root question by checking out this project and its code – and they do handle the runner's heart rate/GPS combo data as well:
http://www.gpsbabel.org/
I'd suggest you start with checking the legality of reverse engineering in your country of origin. Most countries have very strict laws about what is allowed and what isn't regarding reverse engineering devices and code.
I would start by seeing what data is being sent by the device, then consider how such data could be represented and packed.
I would first capture many samples, and see if any pattern presents itself, since heart beat is something which is regular and that would suggest it is measurement related to the heart itself. I would also look for bit fields which are monotonically increasing, as that would suggest some sort of time stamp.
Having formed a hypothesis about what is where, I would write a program to test it, graph the results, and see if they make sense. If they do, but not quite, closer inspection would probably reveal that you need some scaling factors here or there. It is also entirely possible the data needs some processing before it looks anything like what their program shows, i.e. you might need to integrate the data points. If I got garbage, it would be back to the drawing board :-)
I would also check the manufacturer's website, or maybe run strings on their binaries. Finding someone who works in the field of biomedical engineering would also be on my list, as they would probably know what protocols are typically used, if any. I would also look for these protocols and see if any could be applied to the data I am seeing.
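The "monotonically increasing bit fields" hunch is easy to test mechanically: pick a candidate record stride and field offset, and check whether a 32-bit field increases record after record. A sketch (the function, and its little-endian assumption, are mine):

```python
import struct

def find_monotonic_u32(data: bytes, stride: int, offset: int, count: int = 4) -> bool:
    """Test the hypothesis that records of `stride` bytes carry a
    strictly increasing 32-bit little-endian field (e.g. a timestamp)
    at `offset` within each record, over the first `count` records."""
    values = [
        struct.unpack_from("<I", data, i + offset)[0]
        for i in range(0, stride * count, stride)
    ]
    return all(a < b for a, b in zip(values, values[1:]))
```

Looping this over plausible strides and offsets quickly shortlists timestamp candidates, which you could then feed to ctime() as suggested below.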
I'd start by creating a hex dump of the data. Figure it's probably blocked in some power-of-two-sized chunks. Start looking for repeating patterns. Think about what kind of data they're probably sending. Either they're recording each heart beat individually, or they're recording whatever the sensor is sending at fixed intervals. If it's individual beats, then there's going to be a time delta (since the last beat), a duration, and a max or avg strength of some sort. If it's fixed intervals, then it'll probably be a simple vector of readings. There'll probably be a preamble of some sort, with a start timestamp and the sampling rate. You can try decoding the timestamp yourself, or you might try simply feeding it to ctime() and see if they're using standard absolute time format.
Keep in mind that lots of cheap A/D converters only produce 12-bit outputs, so your readings are unlikely to be larger than 16 bits (and the high-order 4 bits may be used for flags). I'd recommend resetting the device so that it's "blank", dumping and storing the contents, then take a set of readings, record the results (whatever the device normally reports), then dump the contents again and try to correlate the recorded results with whatever data appeared after the "blank" dump.
Unsure if this is what you're looking for, but Garmin has created an API that runs in your browser. OS X seems to be supported, as well as Windows browsers... I would try it from Chromium to see if it can be used instead of reverse engineering:
http://developer.garmin.com/web-device/garmin-communicator-plugin/
API Features
• Auto-detection of devices connected to a computer
• Access to device product information like product name and software version
• Read tracks, routes and waypoints from supported recreational, fitness and navigation devices
• Write tracks, routes and waypoints to supported recreational, fitness and navigation devices
• Read fitness data from supported fitness devices
• Geo-code address and save to a device as a waypoint or favorite
• Read and write Garmin XML files (GPX and TCX) as well as binary files
• Support for most Garmin devices (USB, USB mass-storage, most serial devices)
• Support for Internet Explorer, Firefox and Chrome on Microsoft Windows
• Support for Safari, Firefox and Chrome on Mac OS X
Can you synthesize a heart beat using something like a computer speaker? (I have no idea how such devices actually work). Watch how the binary results change based on different inputs.
Ripping apart the device and checking out what's inside would probably help too.