MFC: Using CHtmlView with memory string via about: or data:?

I am trying out CHtmlView to display HTML held in memory variables. After dealing with the various exceptions you get in debug mode, I have it working for very small strings via the about: URI.
Example:
Navigate(_T("about:<html><head></head><body>Hello</body></html>"))
This works for small strings but not for larger ones. Does anyone know the documented length limit for about:?
I also found a newer scheme that is supposed to be available in IE, the data: URI, but when I try
Navigate(_T("data:text/html, <html><head></head><body>Hello</body></html>"))
it doesn't work; I get the "webpage can't be displayed" page instead. Does anyone know why CHtmlView doesn't support data:, and is there any other trick for displaying HTML from a memory variable in CHtmlView?

One option for setting HTML content directly is to load it from memory through an IStream.
MFC's CHtmlEditCtrl uses a similar method to set the document's HTML content, except that MFC uses CStreamOnCString.
You may need to supply the content as UTF-8 for compatibility. To use UTF-8, change CString to CStringA in the code below and pass a UTF-8 string to the function: SetHTMLContent(htmlview, u8"<html>...")
HRESULT SetHTMLContent(CHtmlView* htmlview, CString html)
{
    if(!html.GetLength()) return E_FAIL;
    CComPtr<IDispatch> disp = htmlview->GetHtmlDocument();
    if(!disp)
    {
        //not initialized, try again
        htmlview->Navigate(_T("about:"));
        disp = htmlview->GetHtmlDocument();
        if(!disp)
            return E_NOINTERFACE;
    }
    CComQIPtr<IHTMLDocument2> doc2 = disp;
    if(!doc2) return E_NOINTERFACE;
    int charsize = sizeof(html.GetAt(0));
    IStream *istream = SHCreateMemStream(
        reinterpret_cast<const BYTE*>(html.GetBuffer()), charsize * html.GetLength());
    HRESULT hr = E_FAIL;
    if(istream)
    {
        CComQIPtr<IPersistStreamInit> psi = doc2;
        if(psi)
            hr = psi->Load(istream);
        istream->Release();
    }
    html.ReleaseBuffer();
    return hr;
}
Usage:
CString str = _T("<html><head></head><body>Hello</body></html>");
SetHTMLContent(m_chtmlview, str);


Putting C++ string in HTML code to show value on webserver

I've set up a web server running on an ESP8266 that currently hosts 7 pages. Each page is written in plain HTML in its own tab in the Arduino IDE. I have installed the PageBuilder library to help make everything look nice and run.
Except for one thing. I have a button connected to my ESP8266 which, for the time being, imitates a sensor input. Basically, when the button is pressed, my integer "x" is incremented by 1. I also managed to build a string that mirrors "x" and is updated with the same value.
I also have a problem with printing the IP address of the server, but that's not as important as the other issue.
My plan was to write the string "score" (which contains x) into the HTML tab where it should be output. This obviously didn't work.
Things I've tried:
Splitting up the HTML code where I want the string to be printed and using client.println("");
This didn't work because the two libraries do not cooperate and WiFiClient does not find PageBuilder's server (basically, client.println does nothing when I use it with PageBuilder).
Rebuilding the HTML page as one really long string literal and adding in the String with x like this: "html"+score+"html", then putting it where the HTML page const char was (basically replacing the variable with the text that was in the variable).
This didn't work either, because the "PageElement" argument from PageBuilder expects only one string, and it errors out because there's an additional string inside the HTML string.
I've tried sending it as a POST request, but this did not output the value either.
I have run out of ideas to try.
//root page
#if defined(ARDUINO_ARCH_ESP8266)
#include <ESP8266WiFi.h>
#include <ESP8266WebServer.h>
#include <WiFiClient.h>
#elif defined(ARDUINO_ARCH_ESP32)
#include <WiFi.h>
#include <WebServer.h>
#endif
#include "PageBuilder.h"
#include "currentGame.h" //tab 1
#if defined(ARDUINO_ARCH_ESP8266)
ESP8266WebServer Server;
ESP8266WebServer server;
#endif
int sensorPin = 2; // button input
int sensorValue = 0;
int x = 0; // the int x
String score=""; //the string x will be in
PageElement CURRENT_GAME_ELEMENT(htmlPage1);
PageBuilder CURRENT_GAME("/current-game", {CURRENT_GAME_ELEMENT}); // this only shows on href /current-game
void button() {
sensorValue = analogRead(sensorPin); //read the voltage
score="Team 1: "+String((int)x+1); //"make" x a string
if (sensorValue <= 10) { // check if button is pressed
x++; // increment x
Serial.println(x);
Serial.println(score);
delay(100);
}
}
void setup() {
Serial.begin(115200);
pinMode(2, INPUT);
WiFi.softAP("SSID", "PASS");
delay(100);
CURRENT_GAME.insert(Server);
Server.begin();
}
void loop() {
Server.handleClient();
button();
}
// tab 1
const char htmlPage1[] PROGMEM = R"=====(
/*
alot of HTML, basicly the whole website...
..............................................
*/
<div class="jumbotron">
<div align="center">
<h1 class="display-4"> score </h1> // <--- this is where
//I want to print the
//string:
</div>
</div>
)=====";
What I want to do is get the value of the string score displayed on the website. If I put "score" directly into the HTML, the word score is displayed, not the value. I want the value displayed.
Edit:
I have figured out how to get the string (score) into the HTML code, so I only have to convert the HTML string back to a char array. The explanation is in the comment below.
Edit 2: (-------------------------solution-------------------------)
Many thanks for the help I've gotten, and sorry for being so ignorant; it's just so hard being so close when the thing doesn't work. Anyway, what I did was follow PageBuilder's example and make another element to print in the current game page.
String test(PageArgument& args) {
return score;
}
const char html[] = "<div class=\"jumbotron\"><div align=\"center\"><h1 class=\"display-4\">{{NAME}}</h1></div></div>";
PageElement FRAMEWORK_PAGE_ELEMENT(htmlPage0);
PageBuilder FRAMEWORK_PAGE("/", {FRAMEWORK_PAGE_ELEMENT});
PageElement body_elem(html, { {"NAME", test} });
PageElement CURRENT_GAME_ELEMENT(htmlPage1);
PageBuilder CURRENT_GAME("/current-game", { CURRENT_GAME_ELEMENT, body_elem});
Surprisingly easy once I understood it. Thanks again.
You could try building your string first, then converting it to a const char* like this:
const char * c = str.c_str();
If you can't use a pointer, you could try this:
string s = "yourHTML" + score + "moreHTML";
int n = s.length();
char char_array[n + 1];
strcpy(char_array, s.c_str());
Additionally, you could try the stringstream class from the standard library.
This sort of thing is often done using magic tags in your markup that are detected by the server code before it serves the HTML and filled in by executing some sort of callback or filling in a variable, or whatever.
So with this in mind and hoping for the best, I nipped over to: PageBuilder on github and looked to see if there was something similar here. Good news! In one of the examples:
const char html[] = "hello <b>{{NAME}}</b>, <br>Good {{DAYTIME}}.";
...
PageElement body_elem(html, { {"NAME", AsName}, {"DAYTIME", AsDayTime} });
Where {{NAME}} and {{DAYTIME}} are magic tokens. AsName and AsDayTime are functions to be called when the respective tag is encountered while the page is being served.
EDIT: in response to a request to explain differently, I'm not convinced I can do a better job of explaining the code than the example on the library's own github page, so I'll try a wordy description instead:
When you want to serve a webpage to a client, the code needs to know what you want to serve. In the simplest case, it's a static page: the same every time. You can just write the HTML, stick it in a string, and be done.
whole_page = "<html>My fixed content</html>";
webserver.serve(whole_page);
But you want some dynamic element(s). As noted, you can do it in a few ways, such as serving some static HTML, then the dynamic bit, then some more static HTML. It seems you've not had much luck like this, and it's rather clunky anyway.
Or you can pre-build a new string out of the three bits and serve that in one chunk, but that's also pretty clunky.
(Aside: taking big strings and adding them together is likely to be slow and memory intensive, two things you really don't want on a little CPU like the ESP8266).
So instead, you allow 'magic' markers in the HTML, using a marker in place of the dynamic content, and serve that instead.
whole_page = "<html>My dynamic content. Value is {{my_value}}</html>";
webserver.serve(whole_page, ...);
The clever bit is that as the page is being served, the webserver is watching the text go by, and when it sees a magic tag, it stops and asks you to fill in the blank, then carries on as before.
Obviously, there is some processing overhead with watching for tags, and some programming overhead with telling it what tags to watch for and how to ask you for the data it needs.
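To make the description above concrete, here is a minimal sketch of the tag-expansion idea in plain standard C++. It is not PageBuilder's actual implementation; `render` and `TokenHandlers` are hypothetical names chosen for illustration, and the sketch assumes markers are written as {{NAME}} with no nesting.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>

// Hypothetical types for illustration only (not part of PageBuilder):
// each token name maps to a callback that produces the text to insert.
using TokenHandlers = std::map<std::string, std::function<std::string()>>;

// Copies `page` to the output, replacing every {{TOKEN}} that has a handler
// with the handler's return value. Unknown tokens are kept verbatim.
std::string render(const std::string& page, const TokenHandlers& handlers) {
    std::string out;
    std::size_t pos = 0;
    while (true) {
        const std::size_t open = page.find("{{", pos);
        if (open == std::string::npos) break;
        const std::size_t close = page.find("}}", open + 2);
        if (close == std::string::npos) break;      // unterminated marker: emit the rest as-is
        out.append(page, pos, open - pos);          // static text before the marker
        const std::string name = page.substr(open + 2, close - open - 2);
        const auto it = handlers.find(name);
        if (it != handlers.end()) {
            out += it->second();                    // ask the callback to fill in the blank
        } else {
            out.append(page, open, close + 2 - open);
        }
        pos = close + 2;
    }
    out.append(page, pos, std::string::npos);       // trailing static text
    return out;
}
```

Because the callback runs each time the page is rendered, a handler that returns the current score would always insert the latest value without the HTML itself being rebuilt by hand.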
I got advice from a friend who told me I should put a unique placeholder where I wanted the string (x) and then use some syntax to replace it. I also took inspiration from you, Jelle.
What I did was make a unique placeholder, "VAR_CURRENT_SCORE", and put it into the HTML where I want the score output. I then convert htmlPage1 from a char array to a String and use string.replace() to replace "VAR_CURRENT_SCORE" with the string (x) score. This works, as I can see in the serial monitor output.
This is what I did:
//root page
String HTMLstring(htmlstringPage);
delay(100);
HTMLstring.replace("VAR_CURRENT_SCORE", score);
delay(50);
Serial.println("string:");
Serial.println(HTMLstring);
//tab 1
char htmlstringPage[] PROGMEM = R"=====(
<div class="jumbotron">
<div align="center">
<h1 class="display-4">VAR_CURRENT_SCORE</h1>
</div>
</div>
)=====";
However, I still have a small problem left, which is converting the String back to a char array to post it to the website.
To convert the string back:
request->send_P(200, "text/html", HTMLstring.c_str());

Equivalent of Platform::IBoxArray in C++/WinRT

I am currently porting a UWP application from C++/CX to C++/WinRT. I encountered a safe_cast<Platform::IBoxArray<byte>^>(data) where data is of type Windows::Foundation::IInspectable ^.
I know that the safe_cast is represented by the as<T> method, and I know there are functions for boxing (winrt::box_value) and unboxing (winrt::unbox_value) in C++/WinRT.
However, I need to know the equivalent of Platform::IBoxArray in order to perform the cast (QueryInterface). According to https://learn.microsoft.com/de-de/cpp/cppcx/platform-iboxarray-interface?view=vs-2017, IBoxArray is the C++/CX equivalent of Windows::Foundation::IReferenceArray, but there is no winrt::Windows::Foundation::IReferenceArray...
Update for background: What I am trying to achieve is retrieving the view transform that the HoloLens attaches to every Media Foundation sample from its camera. My code is based on https://github.com/Microsoft/HoloLensForCV, and I got everything working except for this last step. The problem is located around this piece of code:
static const GUID MF_EXTENSION_VIEW_TRANSFORM = {
0x4e251fa4, 0x830f, 0x4770, 0x85, 0x9a, 0x4b, 0x8d, 0x99, 0xaa, 0x80, 0x9b
};
// ...
// In the event handler, which receives const winrt::Windows::Media::Capture::Frames::MediaFrameReader& sender:
auto frame = sender.TryAcquireLatestFrame();
// ...
if (frame.Properties().HasKey(MF_EXTENSION_VIEW_TRANSFORM)) {
auto /* IInspectable */ userData = frame.Properties().Lookup(MF_EXTENSION_VIEW_TRANSFORM);
// Now I would have to do the following:
// auto userBytes = safe_cast<Platform::IBoxArray<Byte> ^>(userData)->Value;
//viewTransform = *reinterpret_cast<float4x4 *>(userBytes.Data);
}
I'm also working on porting some code from HoloLensForCV to C++/WinRT. I came up with the following solution for a very similar case (but not the exact same line of code you ask about):
auto user_data = source.Info().Properties().Lookup(c_MF_MT_USER_DATA); // type documented as 'array of bytes'
auto source_name = user_data.as<Windows::Foundation::IReferenceArray<std::uint8_t>>(); // Trial and error to get the right specialization of IReferenceArray
winrt::com_array<std::uint8_t> arr;
source_name.GetUInt8Array(arr);
winrt::hstring source_name_str{ reinterpret_cast<wchar_t*>(arr.data()) };
Specifically, you can replace the safe_cast with .as<Windows::Foundation::IReferenceArray<std::uint8_t>>() for a boxed array of bytes. Then, I suspect doing the same cast as me (except to float4x4* instead of wchar_t*) will work for you.
The /ZW flag is not required for my example above.
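As a side note on the final cast, here is a minimal sketch in plain C++ of turning the byte array into a matrix. It assumes the boxed array holds at least 64 bytes laid out as 16 consecutive floats; `Matrix4x4` is a hypothetical stand-in for float4x4 and `matrixFromBytes` is a made-up helper name. Copying with std::memcpy after a size check avoids reading past the end of a short buffer and sidesteps alignment concerns that a bare reinterpret_cast can raise.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for float4x4: 16 floats stored contiguously.
struct Matrix4x4 {
    float m[16];
};

// Copies the first sizeof(Matrix4x4) bytes of `data` into `out`.
// Returns false (leaving `out` untouched) if the buffer is null or too short.
bool matrixFromBytes(const std::uint8_t* data, std::size_t size, Matrix4x4& out) {
    if (data == nullptr || size < sizeof(Matrix4x4)) {
        return false;
    }
    std::memcpy(&out, data, sizeof(Matrix4x4));
    return true;
}
```

With the com_array from the snippet above, the call would presumably look like matrixFromBytes(arr.data(), arr.size(), m).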
I can't believe that actually worked, but using information from https://learn.microsoft.com/de-de/windows/uwp/cpp-and-winrt-apis/interop-winrt-cx, I came up with the following solution:
Enable "Consume Windows Runtime Extension" via /ZW and use the following conversion:
auto abi = reinterpret_cast<Platform::Object ^>(winrt::get_abi(userData));
auto userBytes = safe_cast<Platform::IBoxArray<byte> ^>(abi)->Value;
viewTransform = *reinterpret_cast<float4x4 *>(userBytes->Data);
Unfortunately, the solution has the drawback of generating
warning C4447: 'main' signature found without threading model. Consider using 'int main(Platform::Array^ args)'.
But for now, I can live with it ...

aws-sdk-cpp: how to use CurlHttpClient?

I need to make signed requests to AWS ES, but am stuck at the first hurdle in that I cannot seem to be able to use CurlHttpClient. Here is my code (verb, path, and body defined elsewhere):
Aws::Client::ClientConfiguration clientConfiguration;
clientConfiguration.scheme = Aws::Http::Scheme::HTTPS;
clientConfiguration.region = Aws::Region::US_EAST_1;
auto client = Aws::MakeShared<Aws::Http::CurlHttpClient>(ALLOCATION_TAG, clientConfiguration);
Aws::Http::URI uri;
uri.SetScheme(Aws::Http::Scheme::HTTPS);
uri.SetAuthority(ELASTIC_SEARCH_DOMAIN);
uri.SetPath(path);
Aws::Http::Standard::StandardHttpRequest req(uri, verb);
req.AddContentBody(body);
auto res = client->MakeRequest(req);
Aws::Http::HttpResponseCode resCode = res->GetResponseCode();
if (resCode == Aws::Http::HttpResponseCode::OK) {
Aws::IOStream &body = res->GetResponseBody();
rejoiceAndBeMerry();
}
else {
gotoPanicStations();
}
When executed, the code throws a bad_function_call from deep within the SDK, mixed up with a lot of shared_ptr this and allocate that. My guess is that I am just using the SDK wrong, but I've been unable to find any examples that use CurlHttpClient directly, as I need to do here.
How can I use CurlHttpClient?
You shouldn't be using the HTTP client directly, but rather the wrappers supplied with the aws-cpp-sdk-es package. Like the previous answer(s), I would recommend studying the test cases shipped with the library to see how the original authors intended the API to be used (at least until the documentation catches up).
How can I use CurlHttpClient?
You're on the right track with managed shared resources and helper functions. You just need to create a static factory/client to reference. Here's a generic example.
using namespace Aws::Client;
using namespace Aws::Http;
static std::shared_ptr<HttpClientFactory> MyClientFactory; // May not be needed
static std::shared_ptr<HttpClient> MyHttpClient;
// ... jump ahead to method body ...
ClientConfiguration clientConfiguration;
MyHttpClient = CreateHttpClient(clientConfiguration);
Aws::String uri("https://example.org");
std::shared_ptr<HttpRequest> req(
CreateHttpRequest(uri,
verb, // i.e. HttpMethod::HTTP_POST
Utils::Stream::DefaultResponseStreamFactoryMethod));
req->AddContentBody(body); //<= remember `body' should be `std::shared_ptr<Aws::IOStream>',
// and can be created with `Aws::MakeShared<Aws::StringStream>("")';
req->SetContentLength(body_size);
req->SetContentType(body_content_type);
std::shared_ptr<HttpResponse> res = MyHttpClient->MakeRequest(*req);
HttpResponseCode resCode = res->GetResponseCode();
if (resCode == HttpResponseCode::OK) {
Aws::StringStream resBody;
resBody << res->GetResponseBody().rdbuf();
rejoiceAndBeMerry();
} else {
gotoPanicStations();
}
I encountered exactly the same error when trying to download from S3 using CurlHttpClient.
I fixed it by instead modelling my code after the integration test found in the cpp sdk:
aws-sdk-cpp/aws-cpp-sdk-s3-integration-tests/BucketAndObjectOperationTest.cpp
Search for the test called TestObjectOperationsWithPresignedUrls.

LibXML C++ XPathEval Errors

For starters, I'm seeing two types of problems with the functionality of my code. I can't seem to find the correct element with the function xmlXPathEvalExpression. In addition, I am receiving errors similar to:
HTML parser error : Unexpected end tag : a
This happens for what appears to be all tags in the page.
For some background, the HTML is fetched by CURL and fed into the parsing function immediately after. For the sake of debugging, the return statements have been replaced with printf.
std::string cleanHTMLDoc(std::string &aDoc, std::string &symbolString) {
std::string ctxtID = "//span[id='" + symbolString + "']";
htmlDocPtr doc = htmlParseDoc((xmlChar*) aDoc.c_str(), NULL);
xmlXPathContextPtr context = xmlXPathNewContext(doc);
xmlXPathObjectPtr result = xmlXPathEvalExpression((xmlChar*) ctxtID.c_str(), context);
if (xmlXPathNodeSetIsEmpty(result->nodesetval)) {
xmlXPathFreeObject(result);
xmlXPathFreeContext(context);
xmlFreeDoc(doc);
printf("[ERR] Invalid XPath\n");
return "";
}
else {
int size = result->nodesetval->nodeNr;
for (int i = size - 1; i >= 0; --i) {
printf("[DBG] %s\n", result->nodesetval->nodeTab[i]->name);
}
return "";
}
}
The parameter aDoc contains the HTML of the page, and symbolString contains the id of the item we're looking for; in this case yfs_l84_aapl. I have verified that this is an element on the page in the style span[id='yfs_l84_aapl'] or <span id="yfs_l84_aapl">.
From what I've read, the errors fed out of the HTML Parser are due to a lack of a namespace, but when attempting to use the XHTML namespace, I've received the same error. When instead using htmlParseChunk to write out the DOM tree, I do not receive these errors due to options such as HTML_PARSE_NOERROR. However, the htmlParseDoc does not accept these options.
For the sake of information, I am compiling with Visual Studio 2015 and have successfully compiled and executed programs with this library before. My apologies for the poorly formatted code. I recently switched from writing Java in Eclipse.
Any help would be greatly appreciated!
[Edit]
It's not a pretty answer, but I made what I was looking to do work. Instead of looking through the DOM by my (assumed) incorrect XPath expression, I moved through tag by tag to end up where I needed to be, and hard-coded in the correct entry in the nodeTab attribute of the nodeSet.
The code is as follows:
std::string StockIO::cleanHTMLDoc(std::string htmlInput) {
std::string ctxtID = "/html/body/div/div/div/div/div/div/div/div/span/span";
xmlChar* xpath = (xmlChar*) ctxtID.c_str();
htmlDocPtr doc = htmlParseDoc((xmlChar*) htmlInput.c_str(), NULL);
xmlXPathContextPtr context = xmlXPathNewContext(doc);
xmlXPathObjectPtr result = xmlXPathEvalExpression(xpath, context);
if (xmlXPathNodeSetIsEmpty(result->nodesetval)) {
xmlXPathFreeObject(result);
xmlXPathFreeContext(context);
xmlFreeDoc(doc);
printf("[ERR] Invalid XPath\n");
return "";
}
else {
xmlNodeSetPtr nodeSet = result->nodesetval;
xmlNodePtr nodePtr = nodeSet->nodeTab[1];
return (char*) xmlNodeListGetString(doc, nodePtr->children, 1);
}
}
I will leave this question open in hopes that someone will help elaborate upon what I did wrong in setting up my XPath expression.
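One likely culprit, offered as a guess rather than a verified fix: in XPath, an attribute test is written with a leading @. The expression //span[id='x'] looks for a span that has a child element named id, while //span[@id='x'] tests the id attribute. A minimal sketch of building the expression follows; the helper name `spanById` is hypothetical, and it assumes the id contains no single quote.

```cpp
#include <string>

// Hypothetical helper: builds an XPath expression that selects every <span>
// whose id attribute equals `id`. Assumes `id` contains no single quote,
// since the sketch does no escaping.
std::string spanById(const std::string& id) {
    return "//span[@id='" + id + "']";
}
```

The resulting string can be passed to xmlXPathEvalExpression exactly as ctxtID was in the first version. As for the "Unexpected end tag" messages, they usually just reflect malformed markup in the page; htmlReadDoc, unlike htmlParseDoc, takes an options argument where HTML_PARSE_NOERROR and HTML_PARSE_NOWARNING can be set, which may be worth trying.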

Methods for deleting blank (or nearly blank) pages from TIFF files

I have something like 40 million TIFF documents, all 1-bit single page duplex. In about 40% of cases, the back image of these TIFFs is 'blank' and I'd like to remove them before I do a load to a CMS to reduce space requirements.
Is there a simple method to look at the data content of each page and delete it if it falls under a preset threshold, say 2% 'black'?
I'm technology agnostic on this one, but a C# solution would probably be the easiest to support. Problem is, I've no image manipulation experience so don't really know where to start.
Edit to add: The images are old scans and so are 'dirty', so this is not expected to be an exact science. The threshold would need to be set to avoid the chance of false positives.
You probably should:
open each image
iterate through its pages (using Bitmap.GetFrameCount / Bitmap.SelectActiveFrame methods)
access bits of each page (using Bitmap.LockBits method)
analyze contents of each page (simple loop)
if the content is worthwhile, copy the data to another image (Bitmap.LockBits and a loop)
This task isn't particularly complex but will require some code to be written. This site contains some samples that you can search for using the method names as keywords.
P.S. I assume that all of the images can be successfully loaded into a System.Drawing.Bitmap.
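The "analyze contents of each page" step can be sketched as a simple pixel count. The sketch below is in plain C++ rather than C#, but the loop carries over directly to the data obtained from Bitmap.LockBits (Scan0 and Stride). The function names are made up for illustration, and it assumes 1 bit per pixel, most significant bit first, with `stride` bytes per row; whether a set bit means black depends on the image, so that is left as a parameter.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Fraction of black pixels in a 1-bit-per-pixel image.
// `stride` is the number of bytes per row (rows may carry padding).
// `blackIsOne` states the assumed polarity: true if a set bit means black.
double blackFraction(const std::vector<std::uint8_t>& bits,
                     std::size_t width, std::size_t height,
                     std::size_t stride, bool blackIsOne) {
    if (width == 0 || height == 0) return 0.0;
    if (stride * 8 < width || bits.size() < stride * height) {
        throw std::invalid_argument("pixel buffer does not match dimensions");
    }
    std::size_t black = 0;
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            const std::uint8_t byte = bits[y * stride + x / 8];
            const bool bit = ((byte >> (7 - x % 8)) & 1u) != 0;  // MSB-first
            if (bit == blackIsOne) ++black;
        }
    }
    return static_cast<double>(black) / static_cast<double>(width * height);
}

// True if the page falls under the threshold (2% black by default).
bool isNearlyBlank(const std::vector<std::uint8_t>& bits,
                   std::size_t width, std::size_t height,
                   std::size_t stride, bool blackIsOne,
                   double threshold = 0.02) {
    return blackFraction(bits, width, height, stride, blackIsOne) < threshold;
}
```

For dirty scans, a plain percentage like this will probably need tuning, or some noise removal first, before it is safe to delete anything.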
You can do something like that with DotImage (disclaimer, I work for Atalasoft and have written most of the underlying classes that you'd be using). The code to do it will look something like this:
public void RemoveBlankPages(Stream stm)
{
List<int> blanks = new List<int>();
if (GetBlankPages(stm, blanks)) {
// all pages blank - delete file? Skip? Your choice.
}
else {
// memory stream is convenient - maybe a temp file instead?
using (MemoryStream ostm = new MemoryStream()) {
// pulls out all the blanks and writes to the temp stream
stm.Seek(0, SeekOrigin.Begin);
RemoveBlanks(blanks, stm, ostm);
CopyStream(ostm, stm); // copies first stm to second, truncating at end
}
}
}
private bool GetBlankPages(Stream stm, List<int> blanks)
{
TiffDecoder decoder = new TiffDecoder();
ImageInfo info = decoder.GetImageInfo(stm);
for (int i=0; i < info.FrameCount; i++) {
try {
stm.Seek(0, SeekOrigin.Begin);
using (AtalaImage image = decoder.Read(stm, i, null)) {
if (IsBlankPage(image)) blanks.Add(i);
}
}
catch {
// bad file - skip? could also try to remove the bad page:
blanks.Add(i);
}
}
return blanks.Count == info.FrameCount;
}
private bool IsBlankPage(AtalaImage image)
{
// you might want to configure the command to do noise removal and black border
// removal (or not) first.
BlankPageDetectionCommand command = new BlankPageDetectionCommand();
BlankPageDetectionResults results = command.Apply(image) as BlankPageDetectionResults;
return results.IsImageBlank;
}
private void RemoveBlanks(List<int> blanks, Stream source, Stream dest)
{
// blanks needs to be sorted low to high, which it will be if generated from
// above
TiffDocument doc = new TiffDocument(source);
int totalRemoved = 0;
foreach (int page in blanks) {
doc.Pages.RemoveAt(page - totalRemoved);
totalRemoved++;
}
doc.Save(dest);
}
You should note that blank page detection is not as simple as "are all the pixels white(-ish)?" since scanning introduces all kinds of interesting artifacts. To get the BlankPageDetectionCommand, you would need the Document Imaging package.
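The index adjustment in RemoveBlanks (subtracting the number of pages already removed) is easy to get wrong, so here is the same idea as a standalone sketch in plain C++. It is unrelated to the DotImage API; `removeAtSorted` is a hypothetical name, and it assumes the indices are unique and sorted low to high.

```cpp
#include <cstddef>
#include <vector>

// Removes the elements at the given ORIGINAL positions from `pages`.
// `sortedIndices` must be unique and ascending; each removal shifts the
// later elements down by one, so the running count is subtracted.
template <typename T>
void removeAtSorted(std::vector<T>& pages,
                    const std::vector<std::size_t>& sortedIndices) {
    std::size_t removed = 0;
    for (std::size_t original : sortedIndices) {
        const std::size_t current = original - removed;  // position after earlier removals
        if (current >= pages.size()) break;              // sketch: stop on out-of-range input
        pages.erase(pages.begin() + static_cast<std::ptrdiff_t>(current));
        ++removed;
    }
}
```

Removing from the highest index down would avoid the bookkeeping altogether, at the cost of iterating the list in reverse.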
Are you interested in shrinking the files, or do you just want to avoid people wasting their time viewing blank pages? You can do a quick and dirty edit of the files to rid yourself of known blank pages by just patching the offset of the second IFD to 0x00000000. Here's what I mean: TIFF files have a simple layout if you're just navigating through the pages:
TIFF Header (4 bytes)
First IFD offset (4 bytes - typically points to 0x00000008)
IFD:
Number of tags (2-bytes)
{individual TIFF tags} (12-bytes each)
Next IFD offset (4 bytes)
Just patch the "next IFD offset" to a value of 0x00000000 to "unlink" pages beyond the current one.
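A minimal sketch of that patch in plain C++, working on a file already loaded into memory. It assumes a classic TIFF (not BigTIFF) and follows the layout listed above; `unlinkPagesAfter` and `readUInt` are hypothetical names. Note that the bytes of the unlinked pages stay in the file, so this hides the pages from viewers but does not shrink anything.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reads an unsigned integer of `n` bytes (2 or 4) at `off` in the file's byte order.
static std::uint32_t readUInt(const std::vector<std::uint8_t>& d, std::size_t off,
                              std::size_t n, bool littleEndian) {
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t idx = littleEndian ? off + n - 1 - i : off + i;
        v = (v << 8) | d[idx];
    }
    return v;
}

// Keeps the first `keepPages` pages by zeroing the "next IFD offset" of
// page number `keepPages`. Returns false if the data is not a classic TIFF,
// is truncated, or has fewer pages than requested.
bool unlinkPagesAfter(std::vector<std::uint8_t>& tiff, std::size_t keepPages) {
    if (keepPages == 0 || tiff.size() < 8) return false;
    bool le;
    if (tiff[0] == 'I' && tiff[1] == 'I') le = true;          // little-endian
    else if (tiff[0] == 'M' && tiff[1] == 'M') le = false;    // big-endian
    else return false;
    if (readUInt(tiff, 2, 2, le) != 42) return false;         // TIFF magic number

    std::size_t ifd = readUInt(tiff, 4, 4, le);               // first IFD offset
    for (std::size_t page = 1; ; ++page) {
        if (ifd == 0 || ifd > tiff.size() - 2) return false;
        const std::size_t tagCount = readUInt(tiff, ifd, 2, le);
        const std::size_t nextPos = ifd + 2 + 12 * tagCount;  // 12 bytes per tag
        if (nextPos > tiff.size() - 4) return false;
        if (page == keepPages) {
            for (std::size_t i = 0; i < 4; ++i) tiff[nextPos + i] = 0;
            return true;
        }
        ifd = readUInt(tiff, nextPos, 4, le);
    }
}
```

For the duplex case in the question, calling unlinkPagesAfter(data, 1) on a file whose back page is known to be blank would leave only the front page reachable.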