Why are public key algorithms slow? - public-key

I'm studying for a test and I still don't understand why public key algorithms are so much slower than symmetric algorithms.

Public-key cryptography is a form of asymmetric cryptography; the difference is the use of an extra cryptographic key.
Symmetric algorithms use a "shared secret": two systems each use the same single cryptographic key to encrypt and decrypt communications.
Public-key cryptography does not use a single shared key; instead it uses mathematical key pairs: a public and a private key. In this system, communications are encrypted with the public key and decrypted with the private key. Here is a better explanation from Wikipedia:
The distinguishing technique used in public key cryptography is the use of asymmetric key algorithms, where the key used to encrypt a message is not the same as the key used to decrypt it. Each user has a pair of cryptographic keys—a public encryption key and a private decryption key. The publicly available encrypting-key is widely distributed, while the private decrypting-key is known only to the recipient. Messages are encrypted with the recipient's public key and can only be decrypted with the corresponding private key. The keys are related mathematically, but the private key cannot feasibly (ie. in actual or projected practice) be derived from the public key. The discovery of algorithms that could produce public/private key pairs revolutionized the practice of cryptography beginning in the middle 1970s.
The computational overhead follows from this design: the public key is available to any system it is exposed to (a public-key system on the internet, for example, exposes the public key to the entire internet). To compensate, both the public and private keys have to be quite large to keep the underlying mathematical problem hard. The result is a strong scheme in which the private decryption key (so far) cannot feasibly be derived from the public encryption key.
There is more that can affect the "speed" of a public-key infrastructure (PKI). Since one of the issues with this system is trust, most implementations involve a certificate authority (CA), an entity that is trusted to vouch for key pairs and validate the keys' "identity".
So, to summarize: larger cryptographic key sizes, two cryptographic keys instead of one, and, with the introduction of a certificate authority, extra DNS look-ups and server round-trips.
It's because of this extra overhead that most implementations benefit from a hybrid scheme, where the public and private keys are used to exchange a session key (much like a shared secret in symmetric algorithms) to get the best of both worlds.
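To make the hybrid idea concrete, here is a minimal sketch in Java (my own illustration, not from the answer above) using the standard javax.crypto APIs: RSA wraps a freshly generated AES session key, and AES-GCM does the bulk encryption.

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.SecureRandom;

public class HybridDemo {
    public static void main(String[] args) throws Exception {
        // Recipient's RSA key pair (the public half would normally be distributed).
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair pair = kpg.generateKeyPair();

        // Fresh AES session key -- the "shared secret" used for the bulk data.
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey session = kg.generateKey();

        // Slow asymmetric step: wrap only the small session key with RSA-OAEP.
        Cipher rsa = Cipher.getInstance("RSA/ECB/OAEPWithSHA-256AndMGF1Padding");
        rsa.init(Cipher.WRAP_MODE, pair.getPublic());
        byte[] wrappedKey = rsa.wrap(session);

        // Fast symmetric step: encrypt the actual message with AES-GCM.
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher aes = Cipher.getInstance("AES/GCM/NoPadding");
        aes.init(Cipher.ENCRYPT_MODE, session, new GCMParameterSpec(128, iv));
        byte[] ciphertext = aes.doFinal("the actual payload".getBytes(StandardCharsets.UTF_8));

        // The recipient unwraps the session key with the RSA private key,
        // then decrypts the payload with plain AES.
        rsa.init(Cipher.UNWRAP_MODE, pair.getPrivate());
        SecretKey recovered = (SecretKey) rsa.unwrap(wrappedKey, "AES", Cipher.SECRET_KEY);
        aes.init(Cipher.DECRYPT_MODE, recovered, new GCMParameterSpec(128, iv));
        System.out.println(new String(aes.doFinal(ciphertext), StandardCharsets.UTF_8));
    }
}

Only the tiny session key pays the asymmetric price; the (potentially large) payload is handled by the fast symmetric cipher.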

Public key algorithms rely on "trapdoor" calculations: ones that are relatively expensive to compute when encrypting and computationally intractable to reverse without the secret key. If the first step is made too easy (which correlates with speed), the second step becomes less hard (more breakable). Consequently, public key algorithms tend to be resource intensive.
Private key (symmetric) algorithms already have the secret on both sides during the encryption phase, so they don't have to do as much work as an algorithm whose encryption key is public.
The above is an over-generalization but should give you a feel for the reasons behind the relative speed differences. That being said, a private key algorithm can be slow and a public key algorithm may have an efficient implementation. The devil is in the details :-)

Encryption and keying methods are a very deep and complex topic that only the smartest mathematical minds in the world can fully understand, but there are top-level views that most people can understand.
The primary difference is that symmetric algorithms require a much, much smaller key than asymmetric (PKI) methods. Because symmetric algorithms work on a "shared secret" (such as abcd1234) which is transferred over a trusted channel (for example, I'm going to call you on the telephone and ask you for the shared secret), the keys don't need to be as long, since they rely on other security measures (i.e. I trust you not to tell the secret to anyone).
PK infrastructure involves sending that "key" over the internet, through untrusted space, and involves using huge prime numbers and much larger keys (1024-bit or 2048-bit rather than 128-bit or 256-bit, for example).
A general rule of thumb is that public-key (PKI) methods are approximately 1,000 times slower than symmetric-key methods.
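If you want to see the gap on your own machine, here is a rough, unscientific Java sketch (my own, not part of the answer; the exact ratio depends heavily on hardware and on the crypto provider): encrypt the same 16-byte block repeatedly with RSA-2048 and with AES-128 and compare the times.

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.SecureRandom;

public class SpeedSketch {
    public static void main(String[] args) throws Exception {
        byte[] block = new byte[16];
        new SecureRandom().nextBytes(block);
        int rounds = 2000;

        // Asymmetric: RSA-2048 public-key encryption of the block.
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair rsaKeys = kpg.generateKeyPair();
        Cipher rsa = Cipher.getInstance("RSA/ECB/PKCS1Padding");
        rsa.init(Cipher.ENCRYPT_MODE, rsaKeys.getPublic());

        long t0 = System.nanoTime();
        for (int i = 0; i < rounds; i++) rsa.doFinal(block);
        long rsaNanos = System.nanoTime() - t0;

        // Symmetric: AES-128 encryption of the same block.
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey aesKey = kg.generateKey();
        Cipher aes = Cipher.getInstance("AES/ECB/NoPadding"); // fine for timing, not for real use
        aes.init(Cipher.ENCRYPT_MODE, aesKey);

        long t1 = System.nanoTime();
        for (int i = 0; i < rounds; i++) aes.doFinal(block);
        long aesNanos = System.nanoTime() - t1;

        System.out.printf("RSA was about %.0f times slower than AES here%n",
                (double) rsaNanos / aesNanos);
    }
}

Note that this measures the cheaper RSA direction (public-key encryption with a small exponent); RSA private-key operations are slower still.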

Related

ES6 Maps and Sets: how are object keys indexed efficiently?

In ES6, Maps and Sets can use objects as keys. However, since the ES6 specification does not dictate the underlying implementation of these data structures, I was wondering how modern JS engines store the keys in order to guarantee O(1), or at least sublinear, retrieval?
In a language like Java, the programmer can explicitly provide a (good) hashCode method which hashes the keys evenly across the key space in order to guarantee performance. However, since JS does not have such a feature, would it still be fair to assume that they use some sort of hashing in their Map and Set implementations?
Any information will be appreciated!
Yes, the implementation is based on hashing, and has (amortized) constant access times.
"they use object identity" is a simplification; the full story is that ES Maps and Sets use the SameValueZero algorithm for determining equality.
In line with this specification, V8's implementation computes "real" hashes for strings and numbers, and chooses a random number as "hash" for objects, which it stores as a private (hidden) property on these objects for later accesses. (That's not quite ideal and might change in the future, but for now that's what it is.)
Using memoryAddress % keySpace cannot work because the garbage collector moves objects around, and rehashing all Maps and Sets every time any object might have moved would be prohibitively complicated and expensive.
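As a loose illustration of that "random number cached on the object" idea, here is a sketch in Java (my own, not V8's actual code; Java itself does something similar internally for System.identityHashCode): the hash is chosen once, stored on the object, and therefore stays stable even if a moving garbage collector relocates the object.

import java.util.Random;

class HashedKey {
    private static final Random RNG = new Random();
    private Integer cachedHash;              // plays the role of V8's hidden property

    int identityHash() {
        if (cachedHash == null) {
            cachedHash = RNG.nextInt();      // chosen once, remembered forever
        }
        return cachedHash;
    }

    public static void main(String[] args) {
        HashedKey k = new HashedKey();
        // Same object, same hash on every lookup -- which is all a hash table needs.
        System.out.println(k.identityHash() == k.identityHash());   // true
        // A different object gets an unrelated hash, regardless of its contents.
        System.out.println(new HashedKey().identityHash());
    }
}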

MySQL: AES_ENCRYPT vs DES_ENCRYPT [duplicate]

Does anyone have a pros-and-cons comparison of these encryption algorithms?
Use AES.
In more details:
DES is the old "data encryption standard" from the seventies. Its key size is too short for proper security (56 effective bits; this can be brute-forced, as was demonstrated more than ten years ago). Also, DES uses 64-bit blocks, which raises some potential issues when encrypting several gigabytes of data with the same key (a gigabyte is not that big nowadays).
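A rough back-of-the-envelope figure for that 64-bit block concern (my own arithmetic, not part of the original answer): in CBC-like modes, ciphertext-block collisions become likely after about 2^(64/2) = 2^32 blocks, i.e. 2^32 blocks of 8 bytes each, or roughly 32 GiB of data under a single key. AES's 128-bit blocks push the same birthday bound out to roughly 2^64 blocks, far beyond any practical concern.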
3DES is a trick to reuse DES implementations, by cascading three instances of DES (with distinct keys). 3DES is believed to be secure up to at least 2^112 security (which is quite a lot, and quite far in the realm of "not breakable with today's technology"). But it is slow, especially in software (DES was designed for efficient hardware implementation, but it sucks in software; and 3DES sucks three times as much).
Blowfish is a block cipher proposed by Bruce Schneier, and deployed in some software. Blowfish can use huge keys and is believed secure, except with regard to its block size, which is 64 bits, just like DES and 3DES. Blowfish is efficient in software, at least on some software platforms (it uses key-dependent lookup tables, hence performance depends on how the platform handles memory and caches).
AES is the successor of DES as the standard symmetric encryption algorithm for US federal organizations (and as the standard for pretty much everybody else, too). AES accepts keys of 128, 192 or 256 bits (128 bits is already unbreakable in practice), uses 128-bit blocks (so no issue there), and is efficient in both software and hardware. It was selected through an open competition involving hundreds of cryptographers over several years. Basically, you cannot do better than that.
So, when in doubt, use AES.
Note that a block cipher is a box which encrypts "blocks" (128-bit chunks of data with AES). When encrypting a "message" which may be longer than 128 bits, the message must be split into blocks, and the actual way you do the split is called the mode of operation or "chaining". The naive mode (simple split) is called ECB and has issues. Using a block cipher properly is not easy, and it is more important than selecting between, e.g., AES or 3DES.
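To see why ECB is the "naive mode with issues", here is a small Java sketch (my own illustration, not from the answer above): two identical plaintext blocks encrypt to two identical ciphertext blocks under ECB, while a chained mode such as CBC hides the repetition.

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import java.security.SecureRandom;
import java.util.Arrays;

public class EcbLeak {
    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey key = kg.generateKey();

        // Two identical 16-byte plaintext blocks.
        byte[] plaintext = new byte[32];
        Arrays.fill(plaintext, (byte) 'A');

        Cipher ecb = Cipher.getInstance("AES/ECB/NoPadding");
        ecb.init(Cipher.ENCRYPT_MODE, key);
        byte[] c1 = ecb.doFinal(plaintext);
        // Identical plaintext blocks -> identical ciphertext blocks: the pattern leaks.
        System.out.println(Arrays.equals(
                Arrays.copyOfRange(c1, 0, 16), Arrays.copyOfRange(c1, 16, 32))); // true

        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);
        Cipher cbc = Cipher.getInstance("AES/CBC/NoPadding");
        cbc.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        byte[] c2 = cbc.doFinal(plaintext);
        // Chaining hides the repetition.
        System.out.println(Arrays.equals(
                Arrays.copyOfRange(c2, 0, 16), Arrays.copyOfRange(c2, 16, 32))); // false
    }
}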
All of these schemes, except AES and Blowfish, have known vulnerabilities and should not be used.
However, Blowfish has been replaced by Twofish.
The encryption methods described are symmetric key block ciphers.
Data Encryption Standard (DES) is the predecessor, encrypting data in 64-bit blocks using a 56 bit key. Each block is encrypted in isolation, which is a security vulnerability.
Triple DES extends the key length of DES by applying three DES operations on each block: an encryption with key 0, a decryption with key 1 and an encryption with key 2. These keys may be related.
DES and 3DES are usually encountered when interfacing with legacy commercial products and services.
AES is considered the successor and modern standard. http://en.wikipedia.org/wiki/Advanced_Encryption_Standard
I believe the use of Blowfish is discouraged.
It is highly recommended that you do not attempt to implement your own cryptography and instead use a high-level implementation such as GPG for data at rest or SSL/TLS for data in transit. Here is an excellent and sobering video on encryption vulnerabilities http://rdist.root.org/2009/08/06/google-tech-talk-on-common-crypto-flaws/
AES is a symmetric cryptographic algorithm, while RSA is an asymmetric (or public key) cryptographic algorithm. Encryption and decryption are done with a single key in AES, while you use separate keys (public and private) in RSA. The strength of a 128-bit AES key is roughly equivalent to that of a 2600-bit RSA key.
Although TripleDESCryptoServiceProvider is a safe and workable method, it is too slow. If you check MSDN you will see that it advises using AES rather than TripleDES. Please check the link below:
http://msdn.microsoft.com/en-us/library/system.security.cryptography.tripledescryptoserviceprovider.aspx
You will see this note in the Remarks section:
Note: A newer symmetric encryption algorithm, Advanced Encryption Standard (AES), is available. Consider using the AesCryptoServiceProvider class instead of the TripleDESCryptoServiceProvider class. Use TripleDESCryptoServiceProvider only for compatibility with legacy applications and data.
Good luck
DES is the old "data encryption standard" from the seventies.
All of these schemes, except AES and Blowfish, have known vulnerabilities and should not be used.
All of them can actually be securely used if wrapped.
Here is an example of AES wrapping.
              DES          AES
Developed     1977         2000
Key Length    56 bits      128, 192, or 256 bits
Cipher Type   Symmetric    Symmetric
Block Size    64 bits      128 bits
Security      Inadequate   Secure
Performance   Fast         Slow
AES is the currently accepted standard algorithm to use (hence the name Advanced Encryption Standard).
The rest are not.

What is the difference between message-passing and method-invocation?

Is there a difference between message-passing and method-invocation, or can they be considered equivalent? This is probably specific to the language; many languages don't support message-passing (though all the ones I can think of support methods) and the ones that do can have entirely different implementations. Also, there are big differences in method invocation depending on the language (C vs. Java vs. Lisp vs. your favorite language). Still, I believe the core question is language-agnostic. What can you do with a passed message that you can't do with an invoked method, and vice versa (in your favorite language)?
Using Objective-C as an example of messages and Java for methods, the major difference is that when you pass messages, the object decides how it wants to handle that message (usually resulting in an instance method on the object being called).
In Java however, method invocation is a more static thing, because you must have a reference to an Object of the type you are calling the method on, and a method with the same name and type signature must exist in that type, or the compiler will complain. What is interesting is the actual call is dynamic, although this is not obvious to the programmer.
For example, consider a class such as
class MyClass {
    void doSomething() {}
}

class AnotherClass {
    void someMethod() {
        Object object = new Object();
        object.doSomething(); // compiler checks and complains that Object contains no such method.
        // However, through an explicit cast, you can calm the compiler down,
        // even though your program will crash at runtime (with a ClassCastException)
        ((MyClass) object).doSomething(); // syntactically valid, yet incorrect
    }
}
In Objective-C however, the compiler simply issues you a warning for passing a message to an Object that it thinks the Object may not understand, but ignoring it doesn't stop your program from executing.
While this is very powerful and flexible, it can result in hard-to-find bugs when used incorrectly because of stack corruption.
Adapted from the article here.
Also see this article for more information.
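For a hands-on feel of the difference, here is a small Java sketch (my own, loosely analogous to Objective-C message sending, not taken from either article): reflection lets you "send" a method by name at run time, and an unknown name only fails when the call is attempted, not at compile time.

import java.lang.reflect.Method;

public class ReflectiveSend {
    public static class Greeter {
        public String greet(String name) { return "Hello, " + name; }
    }

    public static void main(String[] args) throws Exception {
        Object receiver = new Greeter();

        // "Message" = a method name plus arguments, resolved dynamically.
        Method m = receiver.getClass().getMethod("greet", String.class);
        System.out.println(m.invoke(receiver, "world"));   // Hello, world

        // A selector the receiver doesn't understand fails at run time,
        // not at compile time.
        try {
            receiver.getClass().getMethod("fly");
        } catch (NoSuchMethodException e) {
            System.out.println("does not respond to 'fly'");
        }
    }
}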
As a first approximation, the answer is: none, as long as you "behave normally".
Even though many people think there is a difference - technically, it is usually the same: a cached lookup of a piece of code to be executed for a particular named operation (at least in the normal case). Calling the name of the operation a "message" or a "virtual method" does not make a difference.
BUT: actor languages are really different: by having active objects (every object has an implicit message queue and a worker thread - at least conceptually), parallel processing becomes easier to handle (also google "communicating sequential processes" for more).
BUT: in Smalltalk, it is possible to wrap objects to make them actor-like, without actually changing the compiler, the syntax, or even recompiling.
BUT: in Smalltalk, when you try to send a message which is not understood by the receiver (i.e. "someObject foo:arg"), a message object is created, containing the name and the arguments, and that message object is passed as an argument to the "doesNotUnderstand" message. Thus, an object can decide for itself how to deal with unimplemented message sends (a.k.a. calls of an unimplemented method). It can - of course - push them onto a queue for a worker process to sequentialize them...
Of course, this is impossible in statically typed languages (unless you make very heavy use of reflection), but it is actually a VERY useful feature. Proxy objects, code loading on demand, remote procedure calls, learning and self-modifying code, adapting and self-optimizing programs, CORBA and DCOM wrappers, and worker queues are all built upon that scheme. It can be misused and lead to runtime bugs - of course.
So it is a double-edged sword. Sharp and powerful, but dangerous in the hands of beginners...
EDIT: I am writing about language implementations here (as in Java vs. Smalltalk), not inter-process mechanisms.
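For Java readers, a rough analogue of the doesNotUnderstand idea (my own sketch, with hypothetical names) is a dynamic proxy: every call arrives reified as a Method plus arguments, and the handler can execute it, forward it, or simply queue it, much like pushing message objects to a worker process.

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayDeque;
import java.util.Queue;

public class MessageQueueProxy {
    public interface Worker {
        void doSomething(String payload);
    }

    public static void main(String[] args) {
        Queue<String> queue = new ArrayDeque<>();

        InvocationHandler handler = (proxy, method, methodArgs) -> {
            // Instead of executing immediately, record the "message" as data.
            queue.add(method.getName() + "(" + methodArgs[0] + ")");
            return null;
        };

        Worker worker = (Worker) Proxy.newProxyInstance(
                Worker.class.getClassLoader(),
                new Class<?>[] { Worker.class },
                handler);

        worker.doSomething("hello");   // looks like an ordinary method call...
        System.out.println(queue);     // ...but was captured: [doSomething(hello)]
    }
}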
IIRC, they've been formally proven to be equivalent. It doesn't take a whole lot of thinking to at least indicate that they should be. About all it takes is ignoring, for a moment, the direct equivalence of the called address with an actual spot in memory, and consider it simply as a number. From this viewpoint, the number is simply an abstract identifier that uniquely identifies a particular type of functionality you wish to invoke.
Even when you are invoking functions in the same machine, there's no real requirement that the called address directly specify the physical (or even virtual) address of the called function. For example, although almost nobody ever really uses them, Intel protected mode task gates allow a call to be made directly to the task gate itself. In this case, only the segment part of the address is treated as an actual address -- i.e., any call to a task gate segment ends up invoking the same address, regardless of the specified offset. If so desired, the processing code can examine the specified offset, and use it to decide upon an individual method to be invoked -- but the relationship between the specified offset and the address of the invoked function can be entirely arbitrary.
A member function call is simply a type of message passing that provides (or at least facilitates) an optimization under the common circumstance that the client and server of the service in question share a common address space. The 1:1 correspondence between the abstract service identifier and the address at which the provider of that service reside allows a trivial, exceptionally fast, mapping from one to the other.
At the same time, make no mistake about it: the fact that something looks like a member function call doesn't prevent it from actually executing on another machine or asynchronously, or (frequently) both. The typical mechanism to accomplish this is a proxy function that translates the "virtual message" of a member function call into a "real message" that can (for example) be transmitted over a network as needed (e.g., Microsoft's DCOM and CORBA both do this quite routinely).
They really aren't the same thing in practice. Message passing is a way to transfer data and instructions between two or more parallel processes. Method invocation is a way to call a subroutine. Erlang's concurrency is built on the former concept with its Concurrency-Oriented Programming.
Message passing most likely involves a form of method invocation, but method invocation doesn't necessarily involve message passing (if it did, it would be message passing). Message passing is one form of synchronization between two parallel processes. Method invocation generally means synchronous activity: the caller waits for the method to finish before it can continue. Message passing is a form of coroutine; method invocation is a form of subroutine.
All subroutines are coroutines, but not all coroutines are subroutines.
Is there a difference between message-passing and method-invocation, or can they be considered equivalent?
They're similar. Some differences:
Messages can be passed synchronously or asynchronously (e.g. the difference between SendMessage and PostMessage in Windows); see the sketch after this list
You might send a message without knowing exactly which remote object you're sending it to
The target object might be on a remote machine or O/S.
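A minimal Java sketch of that synchronous/asynchronous distinction (my own illustration, standing in for SendMessage vs PostMessage): the "message" is just a task handed to a single-threaded mailbox.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SyncAsync {
    public static void main(String[] args) throws Exception {
        ExecutorService mailbox = Executors.newSingleThreadExecutor();

        // Asynchronous "post": hand the message over and keep going.
        mailbox.execute(() -> System.out.println("handled later, on another thread"));

        // Synchronous "send": block until the receiver has processed it.
        Future<String> reply = mailbox.submit(() -> "handled, here's the reply");
        System.out.println(reply.get());

        mailbox.shutdown();
    }
}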

Is AES_256 stronger than Blowfish?

I'm considering using MySQL's built-in aes_encrypt. I normally use Blowfish, but MySQL doesn't seem to support it natively. How do the two compare? Is one stronger than the other?
AES has a higher design strength than Blowfish - in particular it uses 128 bit blocks, in contrast with Blowfish's 64 bit block size. It's also just much newer - it has the advantage of incorporating several more years of advances in the cryptographic art.
It may interest you to know that the designers behind Blowfish went on to design an improved algorithm called Twofish, which was an entrant (and finalist) in the AES competition.
If you are only looking at security, then these two algorithms rank more or less the same. There are some implementation differences, so unless you want to use an external function, just go with the built-in AES function. If you are going to do it yourself, you might want to use a newer encryption algorithm than Blowfish.
You may be interested in the best public cryptanalysis for both algorithms:
For AES, there exists a related-key attack on the 192-bit and 256-bit versions, discovered by Alex Biryukov and Dmitry Khovratovich, which exploits AES's key scheduling and runs in 2^99.5 operations. This is faster than brute force, but still far from practical. 128-bit AES is not affected by this attack.
For Blowfish, four of its rounds are susceptible to a second-order differential attack (Rijmen, 1997). It can also be distinguished (as in, "Hey, this box is using Blowfish") for a class of weak keys. However, there is no effective cryptanalysis on the full-round version of Blowfish at this moment.
This is pretty subjective, but I'd say AES is more widely used than Blowfish and has been proven secure over the years. So, why not?

Can garbage collection coexist with explicit memory management?

For example, say one was to include a 'delete' keyword in C# 4. Would it be possible to guarantee that you'd never have wild pointers, but still be able to rely on the garbage collector, due to the reference-based system?
The only way I could see it possibly happening is if instead of references to memory locations, a reference would be an index to a table of pointers to actual objects. However, I'm sure that there'd be some condition where that would break, and it'd be possible to break type safety/have dangling pointers.
EDIT: I'm not talking about just .net. I was just using C# as an example.
You can - kind of: make your object disposable, and then dispose of it yourself.
A manual delete is unlikely to improve memory performance in a managed environment. It might help with unmanaged resources, which is what Dispose is all about.
I'd rather have implementing and consuming disposable objects made easier. I have no consistent, complete idea of how this should look, but managing unmanaged resources is a verbose pain under .NET.
An idea for implementing delete:
delete tags an object for manual deletion. At the next garbage collection cycle, the object is removed and all references to it are set to null.
It sounds cool at first (at least to me), but I doubt it would be useful.
This isn't particularly safe, either - e.g. another thread might be busy executing a member method of that object; such a method would need to throw, e.g., when it accesses the object's data.
With garbage collection, as long as you have a reachable reference to the object, it stays alive. With manual delete you can't guarantee that.
Example (pseudocode):
obj1 = new instance;
obj2 = obj1;
//
delete obj2;
// obj1 now references the twilight zone.
Just to be short, combining manual memory management with garbage collection defeats the purpose of GC. Besides, why bother? And if you really want to have control, use C++ and not C#. ;-).
The best you could get would be a partition into two “hemispheres” where one hemisphere is managed and can guarantee the absence of dangling pointers. The other hemisphere has explicit memory management and gives no guarantees. These two can coexist, but no, you can't give your strong guarantees to the second hemisphere. All you could do is to track all pointers. If one gets deleted, then all other pointers to the same instance could be set to zero. Needless to say, this is quite expensive. Your table would help, but introduce other costs (double indirection).
Chris Sells also discussed this on .NET Rocks. I think it was during his first appearance but the subject might have been revisited in later interviews.
http://www.dotnetrocks.com/default.aspx?showNum=10
My first reaction was: why not? I can't imagine that what you want to do is something as obscure as just leaving an unreferenced chunk out on the heap to find again later on, as if a four-byte pointer to the heap were too much to maintain to keep track of this chunk.
So the issue is not leaving unreferenced memory allocated, but intentionally disposing of memory that is still referenced. Since garbage collection performs the function of marking the memory free at some point, it seems that we should just be able to call an alternate sequence of instructions to dispose of this particular chunk of memory.
However, the problem lies here:
String s = "Here is a string.";
String t = s;
String u = s;
junk( s );
What do t and u point to? In a strict reference system, t and u should be null. So that means that you have to not only do reference counting, but perhaps tracking as well.
However, I can see that you should be done with s at this point in your code. So junk could set the reference to null and pass it to the sweeper with a sort of priority code. The GC could be activated for a limited run, and the memory freed only if it is not reachable. So we can't explicitly free anything that somebody has coded to use in some way again. But if s is the only reference, then the chunk is deallocated.
So, I think it would only work with a limited adherence to the explicit side.
It's possible, and already implemented, in non-managed languages such as C++. Basically, you implement or use an existing garbage collector: when you want manual memory management, you call new and delete as normal, and when you want garbage collection, you call GC_MALLOC or whatever the function or macro is for your garbage collector.
See http://www.hpl.hp.com/personal/Hans_Boehm/gc/ for an example.
Since you were using C# as an example, maybe you only had in mind implementing manual memory management in a managed language, but this is to show you that the reverse is possible.
If the semantics of delete on an object's reference were to make all other references to that object null, then you could do it with two levels of indirection (one more than you hint at). Note, though, that while the underlying object would be destroyed, a fixed amount of information (enough to hold a reference) must be kept alive on the heap.
All references a user uses would reference a hidden reference (presumably living on the heap) to the real object. When doing some operation on the object (such as calling a method or relying on its identity, such as using the == operator), the reference the programmer uses would dereference the hidden reference it points to. When deleting an object, the actual object would be removed from the heap, and the hidden reference would be set to null. Thus the references programmers see would evaluate to null.
It would be the GC's job to clean out these hidden references.
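A minimal sketch of that double-indirection scheme, written in Java for illustration (the same idea applies to C#; the Handle name and API are hypothetical): user code only ever holds handles, so an explicit delete can cut every holder off at once by nulling the single hidden slot, and the GC later reclaims both the object and the orphaned handle.

import java.util.Objects;

final class Handle<T> {
    private T target;                      // the one "hidden reference"

    Handle(T target) { this.target = Objects.requireNonNull(target); }

    T get() {
        if (target == null) {
            throw new IllegalStateException("object was explicitly deleted");
        }
        return target;                     // extra indirection on every access
    }

    void delete() { target = null; }       // all handles now observe the deletion
}

class HandleDemo {
    public static void main(String[] args) {
        Handle<StringBuilder> a = new Handle<>(new StringBuilder("hello"));
        Handle<StringBuilder> b = a;       // copying a "reference" copies the handle

        a.delete();                        // explicit "delete"
        try {
            b.get();                       // every other "reference" sees it too
        } catch (IllegalStateException e) {
            System.out.println("b: " + e.getMessage());
        }
    }
}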
This would help in situations with long-lived objects. Garbage Collection works well when objects are used for short periods of time and de-referenced quickly. The problem is when some objects live for a long time. The only way to clean them up is to perform a resource-intensive garbage collection.
In these situations, things would work much easier if there was a way to explicitly delete objects, or at least a way to move a graph of objects back to generation 0.
Yes ... but with some abuse.
C# can be abused a little to make that happen.
If you're willing to play around with the Marshal class, StructLayout attribute and unsafe code, you could write your very own manual memory manager.
You can find a demonstration of the concept here: Writing a Manual Memory Manager in C#.