Displaying a .txt file as pure text without editing the file contents - html

I have a .txt file containing code whose contents I cannot change, and I need to display it in two ways.
One way is inside a div as selectable, copyable text (currently done with:
<pre><?php include '/file_location.txt';?></pre> ).
The other way is as a direct link to the .txt file, so that the link's address can be copied and emailed to someone, saved as..., or used for anything else one might want a direct link for. (So just like <a href="/file_location.txt">, basically.)
The issue is that when the text file is included into a div via PHP, any <%> strings in it are treated as markup and interfere with the displayed source text. I need to preserve the integrity of the original .txt files (so I can't go changing all the left angle brackets into &lt;).
So is there a good way to display the contents of the text file without issues with < > and still maintain its original integrity for the sake of direct linking?
EDIT:
I currently have two separate files performing this function, one with html encodings and the raw unedited .txt file. I'd really like to get these two displays working with just one file so that each new bit of source code doesn't need to be converted to an html-friendly version and adding just its .txt file will grant both view options.
EDIT 2:
Using <textarea> instead of <pre> does not interfere with the < characters, and I could style it with CSS to look how I want, but I don't like the idea of the user being able to resize it themselves.

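You can use
<?php echo htmlspecialchars(file_get_contents('/file_location.txt')); ?>
instead of
<?php include '/file_location.txt';?>
so that special HTML characters in the text file, such as < and >, are escaped and displayed literally instead of being parsed as markup. The .txt file itself is left unchanged.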
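To make the effect of that escaping concrete, here is a minimal sketch of the same kind of transformation, written in JavaScript purely for illustration. The function name escapeHtml is hypothetical, and the exact set of characters PHP's htmlspecialchars escapes depends on its flags and the PHP version, so the replacement table below is an assumption rather than a copy of PHP's behavior.

```javascript
// Hypothetical helper: escapes the characters that would otherwise be
// interpreted as markup when text is placed inside <pre>...</pre>.
function escapeHtml(text) {
  const replacements = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#039;",
  };
  // Replace each special character with its entity in a single pass.
  return text.replace(/[&<>"']/g, (ch) => replacements[ch]);
}

// Example input resembling source code that contains angle brackets.
const source = 'if (a < b && b > 0) { print("<%= x %>"); }';
console.log(escapeHtml(source));
```

Only the displayed copy is escaped; the file on disk stays byte-for-byte the same, which is what keeps the direct link usable.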

I am using this for my PHP files. I think it will be useful for you too.
<?php
highlight_file("test.php");
?>
Edit: I tried it on an HTML file and it worked.

I would try to add <pre></pre> at the beginning/end of your .txt. I'm not sure if I fully understand your question, but I think this will not interfere with the <> tags.

Related

Is it possible in HTML to separate a comma-delimited .txt file into separate columns?

The text file lists thousands of devices. It needs to be read and split into two separate columns.
The text file is named servers.txt. I've already tried searching on Stack Overflow and could not get it to work properly.
Example text is:
test,server
test1,server
test2,server
If you can do this outside of HTML, try the following.
Copy your data into a spreadsheet (the 'Original' column; the formulas below assume the original text is in cell K6):
The formula for 'Left' is:
=LEFT(K6,SEARCH(",",K6)-1)
The formula for 'Right' is:
=MID(K6, SEARCH(",",K6) + 1,LEN(K6))
You can't do this "in HTML." As mentioned in the comments, you'll need a server-side programming language, or possibly JavaScript, or do it by hand, as above, and copy it back wherever you need it.
The question is a bit unclear so it's hard to tell what will work the best.
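If JavaScript is an option, the split could be sketched roughly as follows. This is a minimal sketch, not a definitive solution: the function name csvToTableHtml is hypothetical, and it assumes a plain comma-delimited file with no quoted fields or embedded commas.

```javascript
// Hypothetical helper: turns comma-delimited lines into rows of an HTML table.
// Assumes simple CSV: no quoted fields, no commas inside values.
function csvToTableHtml(csvText) {
  // Escape markup characters so values are shown literally in the table.
  const escape = (s) =>
    s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");

  const rows = csvText
    .split(/\r?\n/)                          // one entry per line
    .filter((line) => line.trim() !== "")    // skip blank lines
    .map((line) => {
      const cells = line
        .split(",")
        .map((cell) => `<td>${escape(cell.trim())}</td>`);
      return `<tr>${cells.join("")}</tr>`;
    });

  return `<table>\n${rows.join("\n")}\n</table>`;
}

// The example text from the question.
const sample = "test,server\ntest1,server\ntest2,server\n";
console.log(csvToTableHtml(sample));
```

In a page served over HTTP, the text of servers.txt could be loaded with fetch and the resulting markup assigned to an element's innerHTML; that wiring is left out here because it depends on how the page is hosted.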

Why is "#" breaking my csv? HTML anchor download

I have an HTML anchor link downloading a csv file. The problem I'm facing is that if there is a '#' symbol in the csv string (in the 'href' attribute) it is breaking the file at that point.
Is it breaking the file for some reason or is it reading it as the end of the file? I really don't know. I've tried escaping the character ('\#' instead of '#') but it just inputs the '\' and then breaks.
Thanks
****EDIT****
I am asking because I am trying to program this to work and am concerned that because certain fields may allow user input (hence, I cannot just avoid using '#') I may face this issue.
I am simply using html and a bit of javascript to create the string to download.
The '#' is breaking the file at that point. No new line, etc, everything after and including the '#' is not going into the file. I have viewed the file using Excel and also basic Notepad on windows. Thanks
****EDIT/EXAMPLE****
Sorry, the information is sensitive but I have included a simple example here.
This works fine:
<a class="download-link" download="test.csv" href="data:text/csv;charset=utf-8,col1h,col2h,col3h
col1data,col2data,col3data">Downloads as expected</a>
This does not. Due to the '#' in 'col1data', the file, when downloaded, ends at 'col1datahasa'. The headers go in fine but everything after and including the '#' does not.
<a class="download-link" download="test.csv" href="data:text/csv;charset=utf-8,col1h,col2h,col3h
col1datahasa#,col2datadontgetshown,col3datadontgetshown">Downloads with missing data</a>
If you copy the two links to a local file and run it you should see what I mean.
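A likely explanation is that '#' starts the fragment part of a URL, so everything from the '#' onward is not treated as part of the data. A minimal sketch of one way around it, assuming the CSV string is built in JavaScript as described: percent-encode the payload with encodeURIComponent before placing it in the href. The function name csvDataUri is hypothetical.

```javascript
// Hypothetical helper: builds a data: URI whose payload is percent-encoded,
// so characters such as '#', ',' and newlines cannot be misread as URL syntax.
function csvDataUri(csvText) {
  return "data:text/csv;charset=utf-8," + encodeURIComponent(csvText);
}

// The failing example from above, with a '#' inside the first data field.
const csv = "col1h,col2h,col3h\ncol1datahasa#,col2data,col3data";
const href = csvDataUri(csv);
console.log(href);

// In the page, the link could then be updated along these lines:
// document.querySelector("a.download-link").href = href;
```

Because the encoding is applied to the whole CSV string, user-entered values do not need to be checked for '#' individually.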

Regex match and delete everything before string (opening html tag)

I'm using Dreamweaver and Notepad++ and have searched high and low but nothing seems to work from what I've found.
I've got a whole stack of HTML pages and I need to remove from all of them everything above, but not including, the first tag in the document. Specifically, everything before the string "<h1" (no quotes). I've tried various examples in Notepad++ and it finds the first h1 tag but doesn't replace everything before it.
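Assuming you want to lose everything in your file before the "<h1" text,
specify ".*<[hH]1" as the search pattern and "<h1" as the replacement, and check
the box marked ". matches newline". Works for me. Note that ".*" is greedy, so if a page contains more than one h1 tag, the match extends to the last one rather than the first.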
You can do this from the Command Line or a text editor that allows you to search-replace multiple files. However, are you sure the content is the same in every html file?
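For batch processing outside an editor, the same search-and-replace idea could be sketched in JavaScript. This is a minimal sketch under stated assumptions: the function name stripBeforeFirstH1 is hypothetical, and it uses a lazy match anchored at the start of the text so that only the content before the first "<h1" is removed, even when a page has several h1 tags.

```javascript
// Hypothetical helper: drops everything before the first "<h1" (any case).
function stripBeforeFirstH1(html) {
  // Without the "m" flag, ^ anchors at the very start of the string.
  // [\s\S]*? lazily matches any characters, including newlines, and the
  // lookahead (?=<h1) stops at the first h1 without consuming it.
  return html.replace(/^[\s\S]*?(?=<h1)/i, "");
}

const page = [
  "<!DOCTYPE html>",
  "<html>",
  "<body>",
  "<h1>First heading</h1>",
  "<p>Text</p>",
  "<H1>Second heading</H1>",
].join("\n");

console.log(stripBeforeFirstH1(page));
```

A page with no h1 tag at all is returned unchanged, which is worth knowing if the pages are not all structured the same way.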

convert pdf into small chunks of data(many chunks per page)?

I have a PDF file and I need to get small pieces of data from it.
It is structured like this :
Page1:
Question 1
......................................
......................................
Question 2
......................................
......................................
Page End
I want to get Question 1 and Question 2 as separate html files, which contain text and image.
I've tried
pdftohtml -c pdffile.pdf output.html
And I got files with PNG images, but how do I cut the image into smaller chunks to fit the size of each Question (I want to separate each question into individual files)?
P.S. I have a lot of PDF files, so a command-line tool would be nice.
I'll try to give you an approach on how I would go about it. You mention that every page in your PDF document might have multiple questions and that you basically want to have one HTML file for every question.
It's great if pdftohtml works for you, but I also found another decent command-line utility that you might want to try out.
OK, so assuming you have an HTML file converted from the PDF you initially had, you might want to use csplit or awk to split your file into multiple files based on a delimiter, 'Question' in your case. (Side note: csplit and awk are Linux/Unix utilities, but I'm sure there are alternatives if you are on Windows or a Mac. I haven't specifically tried the following code.)
From a relevant SO Post :
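# GNU csplit: start a new output file at every line beginning with 'Question'
csplit input.txt '/^Question/' '{*}'
# awk: write to a new file each time a line containing 'Question' is seen;
# lines before the first match go to 0.txt
awk 'BEGIN{filename="0.txt"} /Question/{filename=NR".txt"} {print > filename}' input.txt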
So, assuming this works, you will have a couple of broken html files. Broken because they'll be unsanitized due to dangling < or > or some other stray HTML elements after the splitting.
So you could start by saving the initial .html as .txt, removing the html, head and body elements specifically, and going through the general structure of how the program converts the PDF into HTML. I'm sure you'll see a pattern in how the string 'Question' is wrapped in an element, and that is something you can take care of. That is why I mention .txt files in the code snippets.
You will basically have a bunch of text files with just the content html and not the usual starting tags for an html file because we removed that initially. Then it's only a matter of reading each file, just taking care of the element that surrounds the string 'Question' and adding the html, head and body elements around the content and saving them as .html files. You could do this in any programming language of your choice that supports file reading and writing (would be a fun exercise)
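The splitting and re-wrapping step described above could be sketched along these lines. This is a minimal sketch, not tested against real pdftohtml output: the function name splitIntoQuestionPages is hypothetical, and it assumes that after cleanup each question starts on its own line beginning with the word "Question".

```javascript
// Hypothetical helper: splits cleaned-up content into one HTML document
// per question. Assumes each question starts on a line beginning with
// "Question"; anything before the first such line is dropped.
function splitIntoQuestionPages(content) {
  const chunks = [];
  let current = null;

  for (const line of content.split(/\r?\n/)) {
    if (/^Question\b/.test(line)) {
      current = [];          // start collecting a new question
      chunks.push(current);
    }
    if (current) current.push(line);
  }

  // Wrap each chunk in the html, head and body elements removed earlier.
  return chunks.map(
    (lines, i) =>
      `<!DOCTYPE html>\n<html>\n<head><title>Question ${i + 1}</title></head>\n` +
      `<body>\n${lines.join("\n")}\n</body>\n</html>\n`
  );
}

const content = "Page1:\nQuestion 1\n<p>text a</p>\nQuestion 2\n<p>text b</p>";
const pages = splitIntoQuestionPages(content);
console.log(pages.length);
console.log(pages[0]);
```

Under Node.js, each returned string could then be written to its own .html file with fs.writeFileSync; the file naming is left open here because the original does not specify one.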
I hope this gets you started in the right direction.

How can I convert an OpenOffice Writer document (.odt) to multiple HTML files with navigation?

I have an OpenOffice Writer document (.odt) with a table of contents, sections, subsections, etc.
Is there a quick way to convert (export) this into multiple HTML files with a navigation sidebar, converting the sections into links?
You can:
Unzip the odt, parse the XML and make the HTML file yourself.
Use OpenOffice to export the document to HTML.
There are several ways to export HTML from OpenOffice or LibreOffice:
Use File > Export, then select file type XHTML. However, this creates one big HTML file, not multiple files.
Use File > Save as, then select file type HTML document. This creates one big HTML file which is similar but not fully equal to the one above.
Use File > Send > Create HTML document. In the following dialog, you can select a style used in the document based on which the document is split into multiple HTML files. However, I did not get this to work properly. My document is always split on level 1, no matter what I selected here.
Use File > Wizards > Web page. You will get multiple settings to choose from. However, this does not work at all for me. It either fails completely or it does not produce the expected output.
The last two solutions were found on the OpenOffice Wiki at https://wiki.openoffice.org/wiki/Documentation/OOo3_User_Guides/Getting_Started/Saving_Writer_documents_as_web_pages
As a conclusion, I cannot provide a complete solution. I am still looking for a good way to solve this problem.