Copy and paste string (Hebrew) in PhpStorm is wrong - phpstorm

My string is:
user/3453/99/ddf102/2077/BV/⁨צילום תעודת זהות⁩.pdf
How can this problem be solved?

Very good question!
You can use this regex to remove non-printable characters:
preg_replace('/[[:^print:]]/u', '', $path);
:)
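For comparison, here is a minimal Python sketch of the same idea. It assumes the stray characters are invisible Unicode format characters (category Cf), such as the directional isolates U+2068 and U+2069 that can end up around right-to-left text when it is copied. The function name strip_format_chars and the sample path are only illustrative.

```python
import unicodedata

def strip_format_chars(text):
    """Drop invisible format (Cf) and control (Cc) characters, keep the rest."""
    return "".join(
        ch for ch in text
        if unicodedata.category(ch) not in ("Cf", "Cc")
    )

# Assumed input: the Hebrew file name wrapped in U+2068 / U+2069.
path = "user/3453/99/ddf102/2077/BV/\u2068צילום תעודת זהות\u2069.pdf"
clean = strip_format_chars(path)
```

Letters, digits, spaces and punctuation are untouched, so the Hebrew name itself survives; only the two invisible wrappers are removed.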

Related

Replacing a string in HTML file in Python

I'm trying to replace a string stored in a list with an HTML tag in a file by doing:
links=[http://hexagon-dashboard-gbc-01/vboard/latest?regs=3281546<!--V68NUR-->]
str1="""%s<!--V68NUR-->"""%(vboard['V68N']['perf.tl'],vboard['V68N']['perf.tl'])
with open(html_file,'r+') as f:
    content = f.read()
    f.seek(0)
    f.truncate()
    f.write(content.replace(links[0],str1))
But I get the following error:
TypeError: replace() argument 1 must be str, not Tag.
What am I missing? Please help me with the modification I have to do.
Updated:
From what you posted, I suppose you are treating an HTML file as plain text and want to perform a string replacement.
The replace() function only works when both of its arguments are strings.
The reason you got the error is that links[0] is not a string but a Tag object.
If you manage to get links like this (note the single quotes)
links=['http://hexagon-dashboard-gbc-01/vboard/latest?regs=3281546<!--V68NUR-->']
then
content.replace(links[0],str1)
would not produce any errors.
To edit HTML files, you can also use an HTML parser instead.
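A minimal sketch of that point: converting the object to a string with str() before calling replace() avoids the TypeError. The Tag class below is a hypothetical stand-in for a parser's tag object, and the markup is made up for illustration.

```python
class Tag:
    """Hypothetical stand-in for a parser tag object (not a real library class)."""

    def __init__(self, markup):
        self.markup = markup

    def __str__(self):
        return self.markup

content = '<p><a href="old">old</a><!--V68NUR--></p>'
links = [Tag('<a href="old">old</a>')]
str1 = '<a href="new">new</a>'

# Passing the object itself reproduces the error from the question.
try:
    content.replace(links[0], str1)
except TypeError as err:
    error = str(err)

# Converting it to a string first makes replace() work.
updated = content.replace(str(links[0]), str1)
```

With a real parser, str() of a tag may not match the text in the file character for character (attribute order, quoting), so the replacement can silently do nothing; checking that the content actually changed is a reasonable safeguard.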

PHP: creating CSV file with windows encoding

I am creating CSV files with PHP. To write the data into my CSV file, I use the PHP function "fputcsv".
This is the issue:
I can open the created file normally with Excel, but I can't import it into a shop system (in this case "shopware"). It says something like "the data could not be read".
And here is the clue:
If I open the created file, choose "save as" and select "CSV (comma delimited)" as the type, the resulting file can be imported into shopware. I read about the PHP function "mb_convert_encoding" and used it to encode the data, but it did not fix the problem.
I would be very glad if you could help me.
Thanks.
Thanks for your input.
I solved this problem by replacing fputcsv with fwrite. Then I just needed to add "\r\n" (thanks wmil) to the end of each line, and the generated file can be read by shopware.
Apparently the fputcsv function uses \n and not \r\n as the EOL character.
I don't think you can set the encoding with fputcsv. However, fputcsv takes the locale setting into account, which you can change with setlocale.
Alternatively, you could send the file directly to the user's browser and set the content type and charset with the header function.
This can't be answered without knowing more about your system. Most likely it has nothing to do with character encoding. It's probably a problem with wrong number of columns or column headers being incorrect.
If it is a character encoding issue, your best bet is:
$new_str = mb_convert_encoding($str, 'Windows-1252', 'auto');
Also end newlines with \r\n, not just \n.
If that doesn't work you'll need to check the software docs.
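As an illustration of the two suggestions above (Windows-1252 encoding and \r\n line endings), here is a minimal sketch in Python rather than PHP. The function name write_windows_csv, the file name and the sample rows are assumptions made for the example; whether shopware needs exactly this combination is not confirmed by the thread.

```python
import csv
import os
import tempfile

def write_windows_csv(path, rows):
    """Write rows as comma-delimited CSV in Windows-1252 with CRLF line endings."""
    # newline="" lets the csv module control the line endings itself.
    with open(path, "w", newline="", encoding="cp1252") as f:
        writer = csv.writer(f, lineterminator="\r\n")
        writer.writerows(rows)

rows = [["id", "name"], ["1", "Müller"]]
path = os.path.join(tempfile.mkdtemp(), "export.csv")
write_windows_csv(path, rows)

# Read the raw bytes back to inspect encoding and line endings.
with open(path, "rb") as f:
    raw = f.read()
```

Inspecting the raw bytes like this is also a quick way to compare a generated file with the one Excel produces via "save as".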

Json parsing with unicode characters

I have a JSON file with Unicode characters, and I'm having trouble parsing it. I've tried it in Flash CS5 with the JSON library, and I have tried it in http://json.parser.online.fr/, and I always get "unexpected token - eval fails".
I'm sorry, there really was a problem with the syntax; it came this way from the client.
Can someone please help me? Thanks
Quoth the RFC:
JSON text SHALL be encoded in Unicode. The default encoding is UTF-8.
So a correctly encoded Unicode character should not be a problem. Which leads me to believe that it's not correctly encoded (maybe it uses latin-1 instead of UTF-8). How did you create the file? In a text editor?
There might be an obscure Unicode whitespace character hidden in your string.
This URL contains more detail:
http://timelessrepo.com/json-isnt-a-javascript-subset
In ASP.NET you would think you could use System.Text.Encoding to convert a string like "Paul\u0027s" back to a string like "Paul's", but I tried for hours and found nothing that worked.
The trouble is that hardcoding a string as shown above already decodes it, as you will see if you put a breakpoint on it. So in the end I wrote a function that converts hex 27 to decimal 39, which gave me HTML encoding, and then decoded that.
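// Replace \uXXXX escapes with HTML numeric entities, then HTML-decode.
// Limitation: f is formatted in decimal, so only escapes whose hex digits
// are all 0-9 (e.g. \u0027) are matched; \u00fc and similar are skipped.
string Padding = "000";
for (int f = 1; f <= 256; f++)
{
    string Hex = "\\u" + Padding.Substring(0, 4 - f.ToString().Length) + f;
    string Dec = "&#" + Int32.Parse(f.ToString(), NumberStyles.HexNumber) + ";";
    HTML = HTML.Replace(Hex, Dec);
}
HTML = System.Web.HttpUtility.HtmlDecode(HTML);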
Ugly as sin, I know, but without the latest framework (not available on my ISP's server) it was the best I could do. Someone must know a better solution.
I had the same problem, and I just changed the file encoding from Mac-Roman/Windows-1252 to UTF-8, and it worked.
I had the same problem with Twitter json files. I was parsing them in Python with json.loads(tweet) but it failed for half of the records.
I changed to Python3 and it works well now.
If you have trouble with the encoding of a JSON file generated by Python with json.dumps() (i.e. escaped codes such as \u00fc aren't displayed correctly regardless of your editor's encoding setting): it outputs ASCII by default and escapes non-ASCII characters! See python json unicode - how do I eval using javascript (and python: json.dumps can't handle utf-8? and Why does json.dumps escape non-ascii characters with "\uxxxx").
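A short sketch of that default and how to turn it off, using only the standard json module. The sample strings are made up for the example.

```python
import json

text = "Müller"

# Default: output is pure ASCII, non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(text)

# ensure_ascii=False keeps the characters themselves in the output.
literal = json.dumps(text, ensure_ascii=False)

# Either form parses back to the same string, and escapes such as \u0027
# are decoded by the parser as well.
round_trip = json.loads(escaped)
decoded = json.loads('"Paul\\u0027s"')
```

When writing the ensure_ascii=False form to a file, opening the file with an explicit encoding="utf-8" keeps the result consistent with the UTF-8 default the RFC quote mentions.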

Need help writing a REGEX to replace text and characters in matched items

I have a large number of HTML files where I have to do a global search for all occurrences of href="" and perform the following on the contents inside the quotes:
%28 = remove
%29 = remove
%2C = remove
%26 = and
%20 = -
_ = -
.htm = .html
lowercase all caps
Any help from someone more knowledgeable in writing regular expressions would be greatly appreciated. I will be entering both the search and replace expressions in Textmate.
Find: (href=".*?)(%28|%29|%2C)(.+?")
Replace: $1$3
Find: (href=".*?)(%26)(.+?")
Replace: $1and$3
Find: (href=".*?)(%20|_)(.+?")
Replace: $1-$3
Find: (href=".*?)(\.htm)
Replace: $1.html
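Each of these replaces only one occurrence per href on each pass, so repeat a replace until it finds no more matches.
I don't know enough about TextMate to help with converting all caps to lowercase.
You might also find the TextMate manual's regex section useful.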
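If running a script over the files is an option, the whole list from the question can be applied in one pass. The following is a sketch in Python, not a TextMate solution; the names RULES, clean_href and clean_html are hypothetical, and it assumes every href value is wrapped in double quotes.

```python
import re

# Ordered (pattern, replacement) pairs taken from the list in the question.
RULES = [
    (r"%28|%29|%2C", ""),
    (r"%26", "and"),
    (r"%20|_", "-"),
    (r"\.htm(?![a-z])", ".html"),  # leaves an existing .html alone
]

def clean_href(match):
    """Normalize the URL captured between the quotes of one href attribute."""
    url = match.group(2).lower()
    for pattern, replacement in RULES:
        url = re.sub(pattern, replacement, url, flags=re.IGNORECASE)
    return match.group(1) + url + match.group(3)

def clean_html(html):
    """Apply clean_href to every href="..." in the given HTML text."""
    return re.sub(r'(href=")([^"]*)(")', clean_href, html)

sample = '<a href="My_Page%20%28Old%29%2C%26More.HTM">x</a>'
result = clean_html(sample)
```

Because the callback handles the whole attribute value at once, every occurrence inside an href is replaced in a single run. Note that it lowercases the entire URL, including any external host names.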

Is there anything to convert HTML special characters in files to normal characters?

I have some source code files which came to me by an HTML output, so they're pretty unusable.
I have things like this:
%include &quot;macros.mac&quot;
Which should be:
%include "macros.mac"
Is there any script (sh, perl, batch, ...) to convert every file (there are about 200) to the appropriate characters? Characters include & lt;, & gt;, ... (I put a space in the middle so that it won't convert them to < and >).
Thank you, it's very appreciated.
If it is just about the four &"<> characters, sed(1) could help:
sed 's/&quot;/"/g; s/&lt;/</g; s/&gt;/>/g; s/&amp;/\&/g'
Update: My original proposal was the following and had a bug:
sed 's/&amp;/\&/g; s/&quot;/"/g; s/&lt;/</g; s/&gt;/>/g'
This would convert "&amp;gt;" into ">", which is wrong.
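As an alternative to chaining sed substitutions, Python's standard html.unescape decodes all named and numeric entities in a single pass, so the ordering problem described above does not come up. This is a minimal sketch; the helper name unescape_file, the sample file name and the assumption that the files are UTF-8 are mine, not from the question.

```python
import html
import os
import tempfile

def unescape_file(path, encoding="utf-8"):
    """Rewrite one file in place with its HTML entities decoded."""
    with open(path, encoding=encoding) as f:
        text = f.read()
    with open(path, "w", encoding=encoding) as f:
        f.write(html.unescape(text))

# Hypothetical sample file containing escaped source code.
path = os.path.join(tempfile.mkdtemp(), "sample.asm")
with open(path, "w", encoding="utf-8") as f:
    f.write('%include &quot;macros.mac&quot;\n; a &amp;gt; b\n')

unescape_file(path)

with open(path, encoding="utf-8") as f:
    result = f.read()
```

For the roughly 200 files, the same function could be called in a loop over the directory listing; keeping a backup copy first is advisable since the rewrite is in place.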
You can try a tool like Windows Grep or TextCrawler for this.