Is there a standard way to add textual information in a TIFF file with FreeImage?

I have a program that generates its results as TIFF files. I would like to add some textual information to these files to keep a record of the program parameters.
I know that a tag named "ImageDescription" can be added to a TIFF file (according to the specification, p. 34); if I could put the program parameters in that field, that would work for me.
But is it possible to set this tag with FreeImage?
If it's not possible, can I add EXIF information to my TIFF file with FreeImage?

To answer my own question: with FreeImage, one simple way to add metadata is to use IPTC tags:
void addTag(FIBITMAP *bitmap, const char *key, const char *value)
{
    FITAG *tag = FreeImage_CreateTag();
    size_t len = strlen(value) + 1;
    FreeImage_SetTagKey(tag, key);
    FreeImage_SetTagCount(tag, len);
    FreeImage_SetTagType(tag, FIDT_ASCII);
    FreeImage_SetTagValue(tag, value);
    FreeImage_SetMetadata(FIMD_IPTC, bitmap, FreeImage_GetTagKey(tag), tag);
    FreeImage_DeleteTag(tag);
}
Then use this function with valid IPTC tags:
// set creator's name, limited to 32 bytes
addTag(bitmap, "By-line", "Creator's name");
// set keyword, limited to 64 bytes
addTag(bitmap, "Keywords", "Param1=foo;Param2=bar");

I would like to modify Mathieve's answer a little bit; otherwise it will not work.
void addTag(FIBITMAP *bitmap, const char *key, const char *value)
{
    FITAG *tag = FreeImage_CreateTag();
    if (tag)
    {
        size_t len = strlen(value) + 1;
        FreeImage_SetTagKey(tag, key);
        // Tag length, tag count and tag type should be set before setting the tag value
        FreeImage_SetTagLength(tag, len);
        FreeImage_SetTagCount(tag, len);
        FreeImage_SetTagType(tag, FIDT_ASCII);
        FreeImage_SetTagValue(tag, value);
        FreeImage_SetMetadata(FIMD_IPTC, bitmap, FreeImage_GetTagKey(tag), tag);
        // Delete the tag after setting the metadata
        FreeImage_DeleteTag(tag);
    }
}
Then use this function with the valid IPTC tags listed in the FreeImage documentation:
// set creator's name, limited to 32 bytes
addTag(bitmap, "By-line", "Creator's name");
// set keyword, limited to 64 bytes
addTag(bitmap, "Keywords", "Param1=foo;Param2=bar");

Related

MFC: Using CHtmlView with memory string via about: or data:?

I am trying out CHtmlView to display HTML from memory variables. After having dealt with the various exceptions you get in debug mode, I have it working for very small strings via the about: URI.
Example:
Navigate(_T("about:<html><head></head><body>Hello</body></html>"))
works for small items but not for larger strings. Does anyone know the documented limitation for about:?
Now I found a new scheme that is supposed to be available in IE, the data: URI, but when I try
Navigate(_T("data:text/html, <html><head></head><body>Hello</body></html>"))
it doesn't work; it comes up with the fancy "webpage can't be displayed" page. Does anyone know why CHtmlView doesn't support data:, and whether there is any other trick that can be used to display HTML from a memory variable in CHtmlView?
One option for setting HTML content directly is to read it from memory using an IStream.
MFC's CHtmlEditCtrl uses a similar method to set the document's HTML content, except MFC uses CStreamOnCString.
You may need to use UTF-8 content for compatibility. To use UTF-8,
change CString to CStringA in the code below, and pass a UTF-8 string to the function: SetHTMLContent(htmlview, u8"<html>...")
HRESULT SetHTMLContent(CHtmlView* htmlview, CString html)
{
    if (!html.GetLength()) return E_FAIL;
    CComPtr<IDispatch> disp = htmlview->GetHtmlDocument();
    if (!disp)
    {
        // not initialized, try again
        htmlview->Navigate(_T("about:"));
        disp = htmlview->GetHtmlDocument();
        if (!disp)
            return E_NOINTERFACE;
    }
    CComQIPtr<IHTMLDocument2> doc2 = disp;
    if (!doc2) return E_NOINTERFACE;
    int charsize = sizeof(html.GetAt(0));
    IStream *istream = SHCreateMemStream(
        reinterpret_cast<const BYTE*>(html.GetBuffer()), charsize * html.GetLength());
    HRESULT hr = E_FAIL;
    if (istream)
    {
        CComQIPtr<IPersistStreamInit> psi = doc2;
        if (psi)
            hr = psi->Load(istream);
        istream->Release();
    }
    html.ReleaseBuffer();
    return hr;
}
Usage:
CString str = _T("<html><head></head><body>Hello</body></html>");
SetHTMLContent(m_chtmlview, str);
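For the UTF-8 variant described above, usage would look roughly like this (a sketch, assuming SetHTMLContent has been changed to take CStringA as described; the meta charset tag helps MSHTML pick the right encoding):
// Hypothetical UTF-8 variant: the buffer reaches SHCreateMemStream as
// raw UTF-8 bytes, so charsize is 1.
CStringA utf8 = u8"<html><head><meta charset=\"utf-8\"></head><body>Hello</body></html>";
SetHTMLContent(m_chtmlview, utf8);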

How to convert large UTF-8 encoded char* string to CStringW (UTF-16)?

I have a problem with converting a UTF-8 encoded string to a UTF-16 encoded CStringW.
Here is my source code:
CStringW ConvertUTF8ToUTF16( __in const CHAR * pszTextUTF8 )
{
    _wsetlocale( LC_ALL, L"Korean" );
    if ( (pszTextUTF8 == NULL) || (*pszTextUTF8 == '\0') )
    {
        return L"";
    }
    const size_t cchUTF8Max = INT_MAX - 1;
    size_t cchUTF8;
    HRESULT hr = ::StringCbLengthA( pszTextUTF8, cchUTF8Max, &cchUTF8 );
    if ( FAILED( hr ) )
    {
        AtlThrow( hr );
    }
    ++cchUTF8;
    int cbUTF8 = static_cast<int>( cchUTF8 );
    int cchUTF16 = ::MultiByteToWideChar(
        CP_UTF8,
        MB_ERR_INVALID_CHARS,
        pszTextUTF8,
        -1,
        NULL,
        0
        );
    CString strUTF16;
    strUTF16.GetBufferSetLength(cbUTF8);
    WCHAR * pszUTF16 = new WCHAR[cchUTF16];
    int result = ::MultiByteToWideChar(
        CP_UTF8,
        0,
        pszTextUTF8,
        cbUTF8,
        pszUTF16,
        cchUTF16
        );
    ATLASSERT( result != 0 );
    if ( result == 0 )
    {
        AtlThrowLastWin32();
    }
    strUTF16.Format(_T("%s"), pszUTF16);
    return strUTF16;
}
pszTextUTF8 is the content of an .htm file in UTF-8.
When the file is smaller than about 500 KB, this code works well,
but when converting a larger file (e.g., a 648 KB file that I have),
pszUTF16 holds the full content of the file, but strUTF16 does not (only about half).
I believe the file is opened correctly.
Inside strUTF16, m_pszData holds the full content; how do I get at it?
strUTF16.GetBuffer() doesn't work.
The code in the question is chock-full of bugs, somewhere in the order of one bug per 1-2 lines of code.
Here is a short summary:
_wsetlocale( LC_ALL, L"Korean" );
Changing a global setting in a conversion function is unexpected and will break the calling code. It isn't even necessary; you aren't using the locale for the encoding conversion.
HRESULT hr = ::StringCbLengthA( pszTextUTF8, cchUTF8Max, &cchUTF8 );
This is passing the wrong cchUTF8Max value (according to the documentation), and counts the number of bytes (vs. the number of characters, i.e. code units). Besides all that, you do not even need to know the number of code units, as you never use it (well, you are, but that is just another bug).
int cbUTF8 = static_cast<int>( cchUTF8 );
While that fixes the prefix (count of bytes as opposed to count of characters), it won't save you from using it later on for something that has an unrelated value.
strUTF16.GetBufferSetLength(cbUTF8);
This resizes the string object that should eventually hold the UTF-16 encoded characters. But it doesn't use the correct number of characters (the previous call to MultiByteToWideChar would have provided that value), but rather chooses a completely unrelated value: The number of bytes in the UTF-8 encoded source string.
But it doesn't stop there: that line of code also throws away the pointer to the internal buffer that was ready to be written to. Failing to call ReleaseBuffer is only a natural consequence of not reading the documentation.
WCHAR * pszUTF16 = new WCHAR[cchUTF16];
While not a bug in itself, it needlessly allocates another buffer (this time passing the correct size). You already allocated a buffer (albeit of the wrong size) in the previous call to GetBufferSetLength. Just use that; that's what the member function is for.
strUTF16.Format(_T("%s"), pszUTF16);
That is the classic anti-pattern associated with the printf family of functions: a convoluted way to write CopyChars (or Append).
Now that that's cleared up, here is the correct way to write that function (or at least one way to do it):
CStringW ConvertUTF8ToUTF16( __in const CHAR * pszTextUTF8 ) {
    // Allocate return value immediately, so that (N)RVO can be applied
    CStringW strUTF16;
    if ( (pszTextUTF8 == NULL) || (*pszTextUTF8 == '\0') ) {
        return strUTF16;
    }
    // Calculate the required destination buffer size
    int cchUTF16 = ::MultiByteToWideChar( CP_UTF8,
                                          MB_ERR_INVALID_CHARS,
                                          pszTextUTF8,
                                          -1,
                                          nullptr,
                                          0 );
    // Perform error checking
    if ( cchUTF16 == 0 ) {
        throw std::runtime_error( "MultiByteToWideChar failed." );
    }
    // Resize the output string and use the pointer to the internal buffer
    wchar_t* const pszUTF16 = strUTF16.GetBufferSetLength( cchUTF16 );
    // Perform conversion (return value ignored, since we just checked for success)
    ::MultiByteToWideChar( CP_UTF8,
                           MB_ERR_INVALID_CHARS, // Use identical flags
                           pszTextUTF8,
                           -1,
                           pszUTF16,
                           cchUTF16 );
    // Perform required cleanup
    strUTF16.ReleaseBuffer();
    // Return converted string
    return strUTF16;
}
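For reference, a minimal usage sketch (ReadFileIntoBuffer is a hypothetical helper that reads the whole file into a std::vector<char>):
// Hypothetical usage: convert the UTF-8 contents of an .htm file.
std::vector<char> utf8 = ReadFileIntoBuffer( "page.htm" ); // hypothetical helper
utf8.push_back( '\0' ); // the -1 length mode requires null termination
CStringW wide = ConvertUTF8ToUTF16( utf8.data() );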

CGI won't display variables through HTML in C (Eclipse)

I have used a FIFO pipe to read some data (weather data) into a char variable. The console displays this variable correctly. However, when I try to display it through HTML on the CGI page, it simply does not display. Code below:
int main(void) {
    int fd;
    char *myfifo = "pressure.txt";
    char buff[BUFFER];
    long fTemp;
    //open and read message
    fd = open(myfifo, O_RDONLY);
    read(fd, buff, BUFFER);
    printf("Received: %s\n", buff);
    close(fd);
    printf("Content-type: text/html\n\n");
    puts("<HTML>");
    puts("<BODY>");
    printf("Data is: %s", buff);
    puts("</BODY>");
    puts("</HTML>");
    return EXIT_SUCCESS;
}
As you can see, in the console it displays correctly:
Received: 2014-08-13 16:54:57
25.0 DegC, 1018.7 mBar
Content-type: text/html
<HTML>
<BODY>
Data is 2014-08-13 16:54:57
25.0 DegC, 1018.7 mBar
</BODY>
</HTML>
logout
But the CGI web page does not display the weather data; it only displays "Data is:".
Two important things when writing a CGI program:
- the program will be run by the web server, which is normally started as a different user (the 'www' user, for example);
- it's possible that the program is started from within another directory, which can cause different behaviour if you don't specify the full path of a file you want to open.
Since both of these things can cause problems, it can be helpful to add some debug information. Of course, it's always a good idea to check the return values of the functions you use.
To make it easier to display debug or error messages, I'd first
move the following code up, so that all output that comes after
it will be rendered by the browser:
printf("Content-type: text/html\r\n\r\n");
puts("<HTML>");
puts("<BODY>");
It may be useful to know what the webserver uses as the directory
from which the program is started. The getcwd
call can help here. Let's use a buffer of size BUFFER to store
the result in, and check if it worked:
char curpath[BUFFER];
if (getcwd(curpath, BUFFER) == NULL)
printf("Can't get current path: %s<BR>\n", strerror(errno));
else
printf("Current path is: %s<BR>\n", curpath);
The getcwd function returns NULL in case of an error, and sets the value
of errno to a number which indicates what went wrong. To convert this
value to something readable, the strerror
function is used. For example, if BUFFER was not large enough to be
able to store the path, you'll see something like
Can't get current path: Numerical result out of range
The open call returns a negative number
if it didn't work, and sets errno again. So, to check if this worked:
fd = open(myfifo, O_RDONLY);
if (fd < 0)
printf("Can't open file: %s<BR>\n", strerror(errno));
In case the file can be found, but the webserver does not have permission
to open it, you'll see
Can't open file: Permission denied
If the program is started from another directory than you think, and
it's unable to locate the file, you would get:
Can't open file: No such file or directory
Adding such debug info should make it more clear what's going on, and more
importantly, what's going wrong.
To make sure the actual data is read without problems as well, the return
value of the read function should be
checked and appropriate actions should be taken. If read fails,
a negative number is returned. To handle this:
numread = read(fd, buff, BUFFER);
if (numread < 0)
printf("Error reading from file: %s<BR>\n", strerror(errno));
A non-negative value indicates success and is the number of bytes that were read. If a full BUFFER bytes were read, it's not at all certain that the last byte in buff is a 0, which printf needs in order to know where the string ends. To make sure the string is in fact null-terminated, the last byte in buff is set to 0:
if (numread == BUFFER)
buff[BUFFER-1] = 0;
Note that this actually overwrites one of the bytes that were read in this
case.
If fewer bytes were read, it's still not certain that the last byte that was
read was a 0, but now we can place our own 0 after the bytes that were read
so none of them are overwritten:
else
buff[numread] = 0;
To make everything work, you may need the following additional include files:
#include <unistd.h>
#include <string.h>
#include <errno.h>
The complete code of what I described is shown below:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#define BUFFER 1024  /* example size; use whatever your original code defines */

int main(void)
{
    int fd, numread;
    char *myfifo = "pressure.txt";
    char buff[BUFFER];
    char curpath[BUFFER];
    // Let's make sure all text output (even error/debug messages)
    // will be visible in the web page
    printf("Content-type: text/html\r\n\r\n");
    puts("<HTML>");
    puts("<BODY>");
    // Some debug info: print the current path
    if (getcwd(curpath, BUFFER) == NULL)
        printf("Can't get current path: %s<BR>\n", strerror(errno));
    else
        printf("Current path is: %s<BR>\n", curpath);
    // Open the file
    fd = open(myfifo, O_RDONLY);
    if (fd < 0)
    {
        // An error occurred, let's see what it is
        printf("Can't open file: %s<BR>\n", strerror(errno));
    }
    else
    {
        // Try to read 'BUFFER' bytes from the file
        numread = read(fd, buff, BUFFER);
        if (numread < 0)
        {
            printf("Error reading from file: %s<BR>\n", strerror(errno));
        }
        else
        {
            if (numread == BUFFER)
            {
                // Make sure the last byte in 'buff' is 0, so that the
                // string is null-terminated
                buff[BUFFER-1] = 0;
            }
            else
            {
                // Fewer bytes were read, make sure a 0 is placed after
                // them
                buff[numread] = 0;
            }
            printf("Data is: %s<BR>\n", buff);
        }
        close(fd);
    }
    puts("</BODY>");
    puts("</HTML>");
    return EXIT_SUCCESS;
}

Convert a signed int to 8-digit hex in Flex

How can I convert an int into an 8-digit hexadecimal string in Flex?
I need a function similar to C#'s ToString("X8"), which does the job there.
But what is the equivalent in Flex?
As described in the docs, it's pretty much the same:
var myInt:int = 255;
var hex:String = myInt.toString(16);
trace(hex); //outputs "ff"
See http://help.adobe.com/en_US/FlashPlatform/reference/actionscript/3/int.html#toString()
If it's colors you're after: the docs describe how to handle that case too.
There is however no built-in way to add the leading zeros. You can use a method like this one to do that:
public function pad(s:String, pattern:String="0", minChars:int=8):String {
    while (s.length < minChars) s = pattern + s;
    return s;
}
trace(pad(hex)); //000000ff
Note: the default pads to 8 digits, as the question asks; adjust minChars for other widths (e.g. 6-digit hex colors).
I found a lot of ways of outputting padded hex values that relied heavily on string padding.
I wasn't really happy with any of those, so this is what I came up with (as a bonus, it fits on one line). You could even shorten it by removing the toUpperCase() call, as case is really irrelevant.
"0x"+ (i+0x1000000).toString(16).substr(1,6).toUpperCase()
If you want to floor or ceiling that to black and white and put that in a function:
public static function toHexColor(i:Number):String {
return i<0 ? "0x000000" : i>0xFFFFFF ? "0xFFFFFF" : "0x"+ (i+0x1000000).toString(16).substr(1,6).toUpperCase() ;
}
Here is a more expanded version with comments:
public static function toHexColor(i:Number):String {
    // enforce ceiling and floor
    if (i > 0xFFFFFF) { return "0xFFFFFF"; }
    if (i < 0) { return "0x000000"; }
    // add the "magic" number
    i += 0x1000000;
    // append the 0x and strip the extra 1
    return "0x" + i.toString(16).substr(1,6).toUpperCase();
}

Methods for deleting blank (or nearly blank) pages from TIFF files

I have something like 40 million TIFF documents, all 1-bit single page duplex. In about 40% of cases, the back image of these TIFFs is 'blank' and I'd like to remove them before I do a load to a CMS to reduce space requirements.
Is there a simple method to look at the data content of each page and delete it if it falls under a preset threshold, say 2% 'black'?
I'm technology agnostic on this one, but a C# solution would probably be the easiest to support. Problem is, I've no image manipulation experience so don't really know where to start.
Edit to add: The images are old scans and so are 'dirty', so this is not expected to be an exact science. The threshold would need to be set to avoid the chance of false positives.
You probably should:
open each image
iterate through its pages (using Bitmap.GetFrameCount / Bitmap.SelectActiveFrame methods)
access bits of each page (using Bitmap.LockBits method)
analyze contents of each page (simple loop)
if the contents are worthwhile, copy the data to another image (Bitmap.LockBits and a loop)
This task isn't particularly complex but will require some code to be written. This site contains some samples that you can search for using the method names as keywords.
P.S. I assume that all of the images can be successfully loaded into a System.Drawing.Bitmap.
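Whichever toolkit you use, the "analyze contents" step above boils down to a counting loop against your 2% threshold. A rough C++ sketch of just that loop (an illustration, not GDI+ code; it assumes 1 bit per pixel, rows packed MSB-first, and set bit = black, so check the PhotometricInterpretation tag, since many bilevel TIFFs invert this):
#include <cstdint>
#include <cstddef>

// Returns true if the fraction of "ink" pixels is below the threshold.
bool IsNearlyBlank(const uint8_t* bits, int width, int height,
                   size_t stride, double threshold = 0.02)
{
    long long black = 0;
    for (int y = 0; y < height; ++y) {
        const uint8_t* row = bits + y * stride;
        for (int x = 0; x < width; ++x) {
            // Test bit x of the packed row (MSB-first within each byte).
            if (row[x >> 3] & (0x80 >> (x & 7)))
                ++black;
        }
    }
    return black < threshold * static_cast<double>(width) * height;
}
With Bitmap.LockBits you would obtain the equivalent pointer and stride from the BitmapData (Scan0 and Stride).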
You can do something like that with DotImage (disclaimer, I work for Atalasoft and have written most of the underlying classes that you'd be using). The code to do it will look something like this:
public void RemoveBlankPages(Stream stm)
{
    List<int> blanks = new List<int>();
    if (GetBlankPages(stm, blanks)) {
        // all pages blank - delete file? Skip? Your choice.
    }
    else {
        // memory stream is convenient - maybe a temp file instead?
        using (MemoryStream ostm = new MemoryStream()) {
            // pulls out all the blanks and writes to the temp stream
            stm.Seek(0, SeekOrigin.Begin);
            RemoveBlanks(blanks, stm, ostm);
            CopyStream(ostm, stm); // copies first stm to second, truncating at end
        }
    }
}

private bool GetBlankPages(Stream stm, List<int> blanks)
{
    TiffDecoder decoder = new TiffDecoder();
    ImageInfo info = decoder.GetImageInfo(stm);
    for (int i = 0; i < info.FrameCount; i++) {
        try {
            stm.Seek(0, SeekOrigin.Begin);
            using (AtalaImage image = decoder.Read(stm, i, null)) {
                if (IsBlankPage(image)) blanks.Add(i);
            }
        }
        catch {
            // bad file - skip? could also try to remove the bad page:
            blanks.Add(i);
        }
    }
    return blanks.Count == info.FrameCount;
}

private bool IsBlankPage(AtalaImage image)
{
    // you might want to configure the command to do noise removal and black
    // border removal (or not) first.
    BlankPageDetectionCommand command = new BlankPageDetectionCommand();
    BlankPageDetectionResults results = command.Apply(image) as BlankPageDetectionResults;
    return results.IsImageBlank;
}

private void RemoveBlanks(List<int> blanks, Stream source, Stream dest)
{
    // blanks needs to be sorted low to high, which it will be if generated
    // as above
    TiffDocument doc = new TiffDocument(source);
    int totalRemoved = 0;
    foreach (int page in blanks) {
        doc.Pages.RemoveAt(page - totalRemoved);
        totalRemoved++;
    }
    doc.Save(dest);
}
You should note that blank page detection is not as simple as "are all the pixels white(-ish)?" since scanning introduces all kinds of interesting artifacts. To get the BlankPageDetectionCommand, you would need the Document Imaging package.
Are you interested in shrinking the files, or do you just want to avoid people wasting their time viewing blank pages? You can do a quick and dirty edit of the files to rid yourself of known blank pages by just patching the second IFD offset to be 0x00000000. Here's what I mean - TIFF files have a simple layout if you're just navigating through the pages:
TIFF Header (4 bytes)
First IFD offset (4 bytes - typically points to 0x00000008)
IFD:
Number of tags (2-bytes)
{individual TIFF tags} (12-bytes each)
Next IFD offset (4 bytes)
Just patch the "next IFD offset" to a value of 0x00000000 to "unlink" pages beyond the current one.
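In code, that patch is just a couple of seeks and a 4-byte write. A minimal sketch, assuming a little-endian ("II") TIFF on a little-endian host (big-endian "MM" files would need byte swapping; the function name is just for illustration):
#include <cstdio>
#include <cstdint>
#include <cstring>

bool UnlinkPagesAfterFirst(const char* path)
{
    FILE* f = std::fopen(path, "r+b");
    if (!f) return false;

    // 8-byte header: byte order (2), magic 42 (2), first IFD offset (4)
    uint8_t header[8];
    if (std::fread(header, 1, 8, f) != 8 ||
        header[0] != 'I' || header[1] != 'I') { // little-endian files only
        std::fclose(f);
        return false;
    }
    uint32_t ifdOffset;
    std::memcpy(&ifdOffset, header + 4, 4); // assumes a little-endian host too

    // The IFD starts with a 2-byte tag count, then 12 bytes per tag,
    // then the 4-byte "next IFD offset" field.
    uint16_t tagCount;
    std::fseek(f, static_cast<long>(ifdOffset), SEEK_SET);
    if (std::fread(&tagCount, 2, 1, f) != 1) { std::fclose(f); return false; }
    long nextIfdField = static_cast<long>(ifdOffset) + 2 + 12L * tagCount;

    // Overwrite the next-IFD offset with 0 to unlink all later pages.
    uint32_t zero = 0;
    std::fseek(f, nextIfdField, SEEK_SET);
    std::fwrite(&zero, 4, 1, f);

    std::fclose(f);
    return true;
}
Note that this only unlinks the later pages: their data still occupies space in the file, so it saves viewers' time rather than disk space.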