Does Free Pascal have a way to implement SHA256 or SHA512? - freepascal

In the Free Pascal libraries there's a hash library that enables use of the MD5 and SHA1 hashing algorithms (http://wiki.freepascal.org/hash). But what if I wanted to use a stronger one, such as SHA256 or SHA512? Could I achieve this using Free Pascal? Searching the FP Wiki returns zero hits for SHA256/SHA512.

In recent versions (the last 2 years or so), there is a package "hash" with units "sha1" and "md5" that implement some basic hashes and checksums.
If you need more, most people use DCPCrypt, as it is easily converted:
http://www.cityinthesky.co.uk/opensource/dcpcrypt
At least, I regularly see posts on the lists from people who are using it.

In Google codesearch I found several units that implement it in pascal.
Query: sha256 | sha512 lang:pascal
One of the sources is from Double Commander, which is a Norton/Total Commander clone that's developed with Free Pascal and Lazarus, so there you go.

For other hashes I use "Delphi Encryption Compendium (DEC) 5.2". I don't know if it works with FPC, but you should try. There is THash_SHA512 and THash_SHA256.
Download it from: http://www.torry.net/pages.php?id=519#939342

Related

Request for Attribute based Encryption pseudocode or java code (Jpbc library)

I want to implement bilinear-pairing Attribute-Based Encryption (ABE) using the Jpbc library. However, I could not find any pseudocode or code. Could you help me find pseudocode or code for ABE? Thank you.
I'm not sure exactly what you want. JPBC is a pairing library for Java; ABE is an encryption scheme. One is a mathematical tool for building an encryption scheme; the other is a scheme found in already-built libraries. So I will try to address each part of your question:
"I want to implement the bilinear pairing Attribute based Encryption using Jpbc library"
If you need a pairing library in Java, jPBC is your best option, although it is not very good. As the authors mention, MIRACL is a good non-Java alternative. Charm can also be a good option, either to build your own ABE implementation (something I do not recommend unless you are a cryptographer with implementation experience) or to use one of theirs.
"Could you help me to find the pseudocode or code of ABE"
If what you want is a Java Implementation of ABE schemes, there are the following libraries available on GitHub:
CP-ABE
DET-ABE
JCPABE
Bear in mind that these libraries are unmaintained and have no active community.
Finally, in case you find it useful, and if you are willing to use other languages, some other ABE implementations are:
Charm (Python)
Rabe (Rust)
OpenABE (C++)

Good practices for app configuration storage?

We have a number of loosely coupled apps, some in PHP and some in Python.
It would be beneficial to have some centralized place where they could get both global and app-specific configuration information.
Something like, for Python:
conf=config_server.get_params(url='http://config_server/get/My_app/all', auth=my_auth_data)
and then ideally use parameters as potentially nested attributes, e.g. conf.APP.URL, conf.GLOBAL.MAX_SALES
I was considering writing my own config server app, but wasn't sure what the pros and cons of that approach would be compared with, e.g., storing the config in a centralized database or some other store that is accessible from multiple sites.
Also, am I perhaps missing some readily available, well-supported tool that could do this? (I had a look at Puppet and Ansible, but they seemed to be very evolved tools that do much more than this. I also looked at Software Recommendations SE, but a number of similar questions there are still unanswered.)
I think it would be a good idea for your configuration mechanism not to be hard-coded to obtain configuration data via a particular technology (such as file, web server or database), but rather be able to obtain configuration data from any of several different technologies. I illustrate this with the following pseudo-code examples:
cfg = getConfig("file.cfg"); # from a file
cfg = getConfig("file#file.cfg"); # also from a file
cfg = getConfig("url#http://config_server/file.cfg"); # from the specified URL
cfg = getConfig("exec#getConfigFromDB.py"); # from stdout of command
The parameter passed to getConfig() might be obtained from, say, a command-line option. The "exec#..." format is a flexible mechanism, but carries the potential danger of somebody specifying a malicious command to execute, for example, "exec#rm -rf /".
This approach means you can experiment with whatever you consider to be an ideal source-of-configuration-data technology and later, if you discover that technology to be inappropriate, it will be trivial to discard it and use a different source-of-configuration-data technology instead. Indeed, the decision for which source-of-configuration-data technology to use might vary from one use case/user to another.
I developed a C++ and Java configuration file parser (sorry, no Python or PHP implementations) called Config4*. If you look at chapters 2 (overview of syntax) and 3 (overview of API) of the Config4* Getting Started Guide, you will notice that it supports the kind of flexible approach I discuss in this answer (the "url#..." format is not supported, but "exec#curl -sS ..." provides the same functionality). 99 percent of the time, I end up using configuration files, but I find it comforting to know that my applications can trivially switch to using a different source-of-configuration-data technology whenever the need might arise.
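The prefix idea above can be sketched in Python. This is only an illustration under stated assumptions, not the Config4* API: get_config, the allow_exec switch, and the choice of JSON as the payload format are all hypothetical. Dicts are turned into nested attributes so that the conf.APP.URL style from the question works, and "exec#" is off by default because of the danger mentioned above.

```python
import json
import subprocess
import urllib.request
from types import SimpleNamespace


def _to_namespace(value):
    """Recursively turn dicts into objects so that conf.APP.URL works."""
    if isinstance(value, dict):
        return SimpleNamespace(**{k: _to_namespace(v) for k, v in value.items()})
    if isinstance(value, list):
        return [_to_namespace(v) for v in value]
    return value


def get_config(source, allow_exec=False):
    """Load JSON config from 'file#path', 'url#http://...' or 'exec#command'.

    A string without a known prefix is treated as a plain file path.
    All names here are hypothetical; JSON is an assumed payload format.
    """
    scheme, sep, rest = source.partition("#")
    if not sep or scheme not in ("file", "url", "exec"):
        scheme, rest = "file", source
    if scheme == "file":
        with open(rest, encoding="utf-8") as fh:
            text = fh.read()
    elif scheme == "url":
        with urllib.request.urlopen(rest, timeout=10) as resp:
            text = resp.read().decode("utf-8")
    else:  # "exec": configuration is read from the command's stdout
        if not allow_exec:
            raise PermissionError("exec# sources are disabled by default")
        # shell=True runs an arbitrary command line; enable only for trusted input.
        result = subprocess.run(rest, shell=True, check=True,
                                capture_output=True, text=True)
        text = result.stdout
    return _to_namespace(json.loads(text))
```

With a file containing {"APP": {"URL": "..."}}, get_config("file#app.json").APP.URL returns the URL, and swapping the argument for a "url#..." string changes the source without touching the calling code.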

Convert chinese characters to hanyu pinyin

How to convert from chinese characters to hanyu pinyin?
E.g.
你 --> Nǐ
马 --> Mǎ
More Info:
Either accents or numerical forms of hanyu pinyin are acceptable, the numerical form being my preference.
A Java library is preferred, however, a library in another language that can be put in a wrapper is also OK.
I would like anyone who has personally used such a library to recommend or comment on it, in terms of its quality/reliability.
The problem of converting hanzi to pinyin is a fairly difficult one. There are many hanzi characters which have multiple pinyin representations, depending on context. Compare 长大 (pinyin: zhang da) to 长城 (pinyin: chang cheng). For this reason, single-character conversion is often actually useless, unless you have a system that outputs multiple possibilities. There is also the issue of word segmentation, which can affect the pinyin representation as well. Though perhaps you already knew this, I thought it was important to say this.
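To make the context problem concrete, here is a minimal sketch of word-first lookup, assuming tiny hand-made dictionaries (WORD_READINGS and CHAR_READINGS are hypothetical; a real system needs a large lexicon and a proper segmenter). It tries whole words before single characters, which is enough to separate the two readings of 长 in the examples above.

```python
# Hypothetical miniature dictionaries, for illustration only.
WORD_READINGS = {
    "长大": ["zhang3", "da4"],
    "长城": ["chang2", "cheng2"],
}
CHAR_READINGS = {
    "长": ["chang2", "zhang3"],  # ambiguous without context
    "大": ["da4"],
    "城": ["cheng2"],
    "你": ["ni3"],
}
MAX_WORD_LEN = max(len(word) for word in WORD_READINGS)


def to_pinyin(text):
    """Greedy longest-match lookup: whole words first, then single characters."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate word starting at position i.
        for length in range(min(MAX_WORD_LEN, len(text) - i), 1, -1):
            word = text[i:i + length]
            if word in WORD_READINGS:
                out.extend(WORD_READINGS[word])
                i += length
                break
        else:
            # No word matched: take the first listed reading, which may be
            # wrong for polyphonic characters; unknown characters pass through.
            ch = text[i]
            out.append(CHAR_READINGS.get(ch, [ch])[0])
            i += 1
    return " ".join(out)


print(to_pinyin("长大"))  # zhang3 da4
print(to_pinyin("长城"))  # chang2 cheng2
print(to_pinyin("长"))    # chang2 (just the first candidate)
```

The last call shows why single-character conversion is unreliable: with no surrounding word, the code can only guess.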
That said, the Adso Package contains both a segmenter and a probabilistic pinyin annotator, based on the excellent Adso library. It takes a while to get used to, though, and may be much larger than you are looking for (I have found in the past that it was a bit too bulky for my needs). Additionally, there doesn't appear to be a public API anywhere, and it's C++ ...
For a recent project, because I was working with place names, I simply used the Google Translate API (specifically, the unofficial Java port), which, for common nouns at least, usually does a good job of translating to pinyin. The problem is commonly-used alternative transliteration systems, such as "HongKong" for what should be "XiangGang". Given all of this, Google Translate is pretty limited, but it offers a start. I hadn't heard of pinyin4j before, but after playing with it just now, I have found that it is less than optimal--while it outputs a list of potential candidate pinyin romanizations, it makes no attempt to statistically determine their likelihood. There is a method to return a single representation, but it will soon be phased out, as it currently only returns the first romanization, not the most likely. Where the program seems to do well is with conversion between romanizations and general configurability.
In short then, the answer may be either any one of these, depending on what you need. Idiosyncratic proper nouns? Google Translate. In need of statistics? Adso. Willing to accept candidate lists without context information? Pinyin4j.
In Python try
from cjklib.characterlookup import CharacterLookup
cjk = CharacterLookup('C')
cjk.getReadingForCharacter(u'北', 'Pinyin')
You would get
['běi', 'bèi']
Disclaimer: I'm the author of that library.
For Java, I'd try the pinyin4j library.
As mentioned in other answers, the conversion is fuzzy, and even Google Translate apparently gets a certain percentage of character combinations wrong.
A reasonable result which will not be 100% accurate can be achieved with open-source libraries available for some programming languages.
The simplest way to do the conversion in Python is with the pypinyin library (to install it, use pip3 install pypinyin):
from pypinyin import pinyin
def to_pinyin(chin):
    # pinyin() returns one list of candidate readings per segment; keep the first
    return ' '.join([seg[0] for seg in pinyin(chin)])
print(to_pinyin('好久不见'))
# OUTPUT: hǎo jiǔ bú jiàn
NOTE: The pinyin method from the module returns a list of possible candidate segments, and the to_pinyin method takes the first variant whenever more than one conversion is available. For tricky corner cases this is likely to produce incorrect results, but generally you'll probably get at least a ~90..95% success rate.
There are a few other python libraries for pinyin conversion but in my tests they proved to have a higher error rate than pypinyin. Also, they don't appear to be actively maintained.
If you need better accuracy then you'll need a more complex approach that will rely on bigger datasets and possibly some machine learning.
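Since the question prefers the numerical form while the examples above print tone marks, a small standard-library sketch can convert between the two. It assumes one syllable per whitespace-separated token and leaves neutral-tone syllables without a digit; the function names are made up for this example.

```python
import unicodedata

# Combining marks that carry the four pinyin tones.
TONE_MARKS = {
    "\u0304": "1",  # macron
    "\u0301": "2",  # acute
    "\u030c": "3",  # caron
    "\u0300": "4",  # grave
}


def syllable_to_numbered(syllable):
    """Convert one accented syllable, e.g. 'hǎo', to numbered form, 'hao3'."""
    tone = ""
    kept = []
    # NFD splits 'ǎ' into 'a' plus a combining caron, so the mark can be removed.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)
    # NFC puts 'u' + diaeresis back together as 'ü'.
    return unicodedata.normalize("NFC", "".join(kept)) + tone


def to_numbered(text):
    """Convert space-separated accented pinyin to numbered pinyin."""
    return " ".join(syllable_to_numbered(s) for s in text.split())


print(to_numbered("hǎo jiǔ bú jiàn"))  # hao3 jiu3 bu2 jian4
```

This only changes the notation of a reading that has already been chosen; it does nothing about picking the right reading in the first place.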

Cross-platform and language (de)serialization

I'm looking for a way to serialize a bunch of C++ structs in the most convenient way so that the serialization is portable across C++ and Java (at a minimum) and across 32bit/64bit, big/little endian platforms. The structures to be serialized just contain data, i.e. they're pure data objects with no state or behavior.
The idea is to serialize the structs into an octet blob that we can store in a database "generically" and read back later. This avoids changing the database schema whenever a struct changes, and it also avoids mapping each data member to a column, i.e. we want a single table that holds everything "generically" as a binary blob. This should mean less work for developers and fewer changes when structures change.
I've looked at boost.serialize but don't think there's a way to enable compatibility with Java. And likewise for inheriting Serializable in Java.
If there is a way to do it by starting with an IDL file that would be best as we already have IDL files that describe the structures.
Cheers in advance!
I stumbled here, having a very similar question. 6 years later, this might not be useful to you, but hopefully it will be to others.
There are a lot of alternatives, unfortunately with no clear winner (although one could argue that JSON is the clear winner). Even Google has released multiple competing technologies (all of them apparently being used internally):
FlatBuffers: this one seems to meet the requirements from the original question, has interesting benchmarks and supports some form of IDL (I'm personally not familiar with IDL)
Protocol Buffers: mentioned previously.
XFJSON: 5%-12% smaller than JSON.
Not to forget the alternatives posted in the other answers. Here are a few more:
YAML: JSON minus all the double quotes, but using indentation instead. It's more human readable, but probably less efficient, especially as it gets larger.
BSON (Binary JSON)
MessagePack (Another compacted JSON)
With so many variations, JSON is clearly the winner in terms of simplicity/convenience and cross-platform access. It has gained even more popularity in the last couple years, with the rise of JavaScript. A lot of people probably use that as a de-facto solution, without giving it much thought (that's what I originally did :P).
However, if size becomes an issue, but you prefer to keep things simple and not use one of the more advanced libraries, you could just compress JSON using zlib (that's what I'm doing now), or some other cross-platform algorithm (but that's a whole other topic).
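The JSON-plus-zlib idea can be sketched with the Python standard library alone. The helper names pack and unpack and the sample record are invented for the example; the Java or C++ side would need to apply the same two steps (deflate, UTF-8 JSON) in reverse.

```python
import json
import zlib


def pack(record):
    """Serialize to compact UTF-8 JSON, then compress with zlib."""
    text = json.dumps(record, separators=(",", ":"), sort_keys=True)
    return zlib.compress(text.encode("utf-8"), 9)


def unpack(blob):
    """Reverse of pack(): decompress, then parse the JSON text."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical sample data: repetitive records compress well.
record = {"points": [{"x": i, "y": i * 2, "label": "sample"} for i in range(200)]}
blob = pack(record)
raw_size = len(json.dumps(record).encode("utf-8"))

assert unpack(blob) == record
print(raw_size, len(blob))  # the compressed blob is much smaller for this input
```

Because both JSON and the zlib stream format are byte-order independent, the resulting blob can be stored as-is and read on any platform.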
To speed up JSON handling in C++, you could also use RapidJSON.
I'm surprised Jon Skeet hasn't already pounced on this one :-)
Protocol Buffers is pretty much designed for this sort of scenario -- passing structured data cross-language.
That said, if you're using a database the way you suggest, you really shouldn't be using a full-strength RDBMS like Oracle or SQL Server but rather a lightweight key-value store such as Berkeley DB or one of the many "cloud table" engines.
If I want to go really, really cross-language, I would normally suggest JSON, given the ease of JavaScript support, the abundance of libraries, and the fact that it is human readable and modifiable (I prefer it to XML, as I find it smaller in terms of characters, faster, and more readable). It's not the most efficient in terms of space, however, and a more machine-readable format like Protocol Buffers or Thrift would have advantages there (Thrift code can be generated from an IDL, but it is also designed for encoding services, so it could be heavier than you want).
You need ASN.1! (Some people refer to this as binary XML.) ASN.1 is very compact and thus ideal to transfer data between two systems. And for those who don't think this is ever used: several Internet protocols are based upon the ASN.1 model for data serialization!
Unfortunately, there aren't many libraries available for Java or C++ that will support ASN.1. I had to work with it several years ago and just couldn't find a good, free or inexpensive tool to allow support for ASN.1 in C++. At Objective Systems they are selling ASN.1/XML solutions but it's extremely expensive. (The ASN.1 compiler for C++ and Java, that is!) It costs you an arm and a leg at least! (But then you will have a tool that you can use with only one hand...)
I'd suggest saving the data with SQLite database. The structs can be stored as database rows in SQLite tables.
The resulting database file is binary compatible across many different platforms and can be stored as a BLOB in your main database. I believe the file size is comparable to compressed XML file with the same data, but memory usage during processing will be significantly less than XML DOM.
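As a rough illustration of this suggestion, the sketch below writes a few rows into a temporary SQLite file, returns the file's bytes (which could then go into a BLOB column of the main database), and later reads the rows back from those bytes. The table layout and function names are hypothetical.

```python
import os
import sqlite3
import tempfile


def structs_to_blob(rows):
    """Write (id, name, price) rows to a temporary SQLite file; return its bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "structs.db")
        conn = sqlite3.connect(path)
        try:
            conn.execute(
                "CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, price REAL)"
            )
            conn.executemany("INSERT INTO item VALUES (?, ?, ?)", rows)
            conn.commit()
        finally:
            conn.close()
        with open(path, "rb") as fh:
            return fh.read()


def blob_to_structs(blob):
    """Restore the SQLite file from its bytes and read the rows back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "structs.db")
        with open(path, "wb") as fh:
            fh.write(blob)
        conn = sqlite3.connect(path)
        try:
            return conn.execute(
                "SELECT id, name, price FROM item ORDER BY id"
            ).fetchall()
        finally:
            conn.close()


rows = [(1, "bolt", 0.25), (2, "nut", 0.1)]
blob = structs_to_blob(rows)
assert blob_to_structs(blob) == rows
```

A C++ or Java reader would open the same bytes with its own SQLite binding, so no custom byte-order handling is needed in application code.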
Why haven't you chosen XML? It suits your requirements perfectly, and both C++ and Java allow for an easy implementation.
Furthermore, I have doubts about your idea of storing everything as a blob in the database. Either use a relational database for what it was designed for, or switch to an object-oriented database such as http://www.versant.com/en_US/products/objectdatabase, which supports both Java and C++.
There is also Avro. Look at this question for a comparison of Apache Thrift, Protocol Buffers, mes and so on.

How can I analyze a closed format (e.g. doc or vce)?

I want to study the .vce format. It's a binary format and it seems more complicated than a simple object serialization. Is there any tool or technique for analyzing a binary format?
You might need to "Reverse-Code-Engineer" a program that uses this file format (http://www.openrce.org/). Tools used for this kind of analysis are: brain, disassembler (IDA Pro for example) and debugger (OllyDbg for example). But beware - the road to successfully reverse engineering a file format is veeeeeerrry hard.
And reversing an application might be illegal depending on where you live!
You'll have to get a library that can read the format (or create one yourself).
Here are some of the Microsoft Office binary format specifications.
I believe it would only be possible through some nasty reverse engineering. It would be very useful to have access to an application that uses the format in question, so that you can generate a few simple files and compare them in a hex editor. You won't get far with this method, but you might be able to figure out the header.
It would also be useful to study some common binary format mechanisms, such as encryption and compression. If you're talking about the Visual CertExam file format, then it is likely that the useful data is strongly encrypted.
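The "generate a few simple files and compare them" step can also be scripted instead of done by eye in a hex editor. A minimal sketch follows; diff_ranges is a made-up helper name, and the byte strings stand in for two files that differ in one small setting.

```python
def diff_ranges(a, b):
    """Return (start, end) offset pairs where a and b differ; end is exclusive."""
    ranges = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i  # a differing run begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(a) != len(b):
        # One input has a tail that the other lacks.
        ranges.append((common, max(len(a), len(b))))
    return ranges


# Stand-ins for two sample files saved with one setting changed.
first = b"HEADxxTAIL"
second = b"HEADyyTAIL"
print(diff_ranges(first, second))  # [(4, 6)]
```

Offsets that stay identical across many samples are candidates for magic numbers and fixed header fields; offsets that change with a single setting hint at where that setting is stored. If the payload is encrypted or compressed, nearly everything will differ and this approach stops being informative.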
My 2 cents:
Start by reversing the applications that read the files. Android applications are particularly helpful, as the resulting Java source is easier to read (you might want to try A+ VCE reader for Android, for example). This program indicates that vce uses/embeds SQLite in the file (in line with what is hinted here: Reverse Engineer a File Format).
Where to go from here? You might want to explore SQLite file carving tools to see if there might be a way to programmatically identify the patterns in the file. Good luck!
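A first carving step could look like the sketch below. It searches a byte string for the 16-byte magic string that starts every SQLite 3 database file and reads the page size stored in the two bytes that follow. The function name is hypothetical, and if the embedded database is encrypted or compressed, no plain header will be found.

```python
import struct

# Every SQLite 3 database file starts with this 16-byte string.
SQLITE_MAGIC = b"SQLite format 3\x00"


def find_sqlite_headers(data):
    """Return (offset, page_size) for each SQLite header found in data."""
    hits = []
    pos = data.find(SQLITE_MAGIC)
    while pos != -1:
        page_size = None
        if pos + 18 <= len(data):
            # Bytes 16-17 of the header hold the page size, big-endian;
            # the stored value 1 stands for 65536.
            (raw,) = struct.unpack(">H", data[pos + 16:pos + 18])
            page_size = 65536 if raw == 1 else raw
        hits.append((pos, page_size))
        pos = data.find(SQLITE_MAGIC, pos + 1)
    return hits


# Synthetic container: 10 bytes of padding, then a header with page size 4096.
container = b"\x00" * 10 + SQLITE_MAGIC + b"\x10\x00" + b"\x00" * 82
print(find_sqlite_headers(container))  # [(10, 4096)]
```

For a real file, read it with open(path, "rb").read() and pass the bytes in; any hit marks an offset from which an embedded database could be carved out and opened with a normal SQLite tool.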