I'm trying to parse the application/ld+json block of a page that I loaded with node-html-parser. I got it all working until I hit this unescaped JSON issue, where a literal \n inside values is messing things up.
The small bit of JSON causing the issue (rest of JSON has been removed):
{
"name": "3. Given the balanced equation: 2H2(g) +
O2(g) --> 2H2O(l)
How many grams of H2O are formed if 9.00 mol H2(g) reacts
completely with an excess of O2(g)?
The molar mass of H2O is 18.0g/mol.
"
}
I tried using this escape function solution (simply put, str.replace(/[\n]/g, '\\n')), but it broke it.
How might I parse this string, where some values contain random newlines, and how do I fix it?
Full Context (just for reference):
Source: https://www.numerade.com/ask/question/3-given-the-balanced-equation-2h2g-o2g-2h2ol-how-many-grams-of-h2o-are-formed-if-900-mol-h2g-reacts-completely-with-an-excess-of-o2g-the-molar-mass-of-h2o-is-180gmol-57997/
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "QAPage",
"mainEntity": {
"@type": "Question",
"name": "3. Given the balanced equation: 2H2(g) +
O2(g) --> 2H2O(l)
How many grams of H2O are formed if 9.00 mol H2(g) reacts
completely with an excess of O2(g)?
The molar mass of H2O is 18.0g/mol.
",
"text": "3. Given the balanced equation: 2H2(g) +
O2(g) --> 2H2O(l)
How many grams of H2O are formed if 9.00 mol H2(g) reacts
completely with an excess of O2(g)?
The molar mass of H2O is 18.0g/mol.
",
"answerCount": 4,
"dateCreated": "Oct. 9, 2021, 6:08 p.m.",
"author": {
"@type": "Person",
"name": "Matthew J."
},
"acceptedAnswer": {
"@type": "Answer",
"upvoteCount": 3,
"text": "In this problem, we have to find a mass of H2 that are formed if nine more of age to relax with an excess of so from balanced equation, we can see that two moles of is to Produced two moles of water, Then nine moles of H two must produce nine moles of water. So now we have most of water. We can easily find the mess MS. Of what, which is equally good number of multi multiply by the molar mass, so it is 1 62 g. So we can say that 1 62 g of H 20 are formed with nine more of age to react with Electricly with an excessive or two.",
"dateCreated": "Oct. 13, 2021, 5:12 p.m.",
"url": "https://www.numerade.com/ask/question/3-given-the-balanced-equation-2h2g-o2g-2h2ol-how-many-grams-of-h2o-are-formed-if-900-mol-h2g-reacts-completely-with-an-excess-of-o2g-the-molar-mass-of-h2o-is-180gmol-57997/",
"author": {
"@type": "Person",
"name": "Taimoor Shabbir"
}
},
"suggestedAnswer": [
{
"@type": "Answer",
"text": "So we're keeping the reaction between hydrogen all section when both off the gas, they ah ah, in the container. So a spark, An initial initiated reaction to your form. Water. So here we have soup on serial fees there from five graham off hydrogen and also syrup on Syria. When a five month off oxygen, what would be the mass of water being produce? So first of all, we have to Bannister can call the Ashram. Um, as you can see, that we have to Ah, hydrogen on the on the left. So we have it for two under, right? And then we have four hydrogen. So on the love we were just put put to you from the hydrogen and then the whole reaction Spartans. Okay, so the next step before we do anything is convert anything. That is not your number. Motion on verbal. So we're still points several feet. That's 75 year bites You where? Sierra 750.1 it if there were 18 eight. Uh, both. Okay, so, um, we have the lumber. I'm also had to found the limiting We agent we had found an emitting region, and then we can just a limbo move in different dimension. We agent to find out the mass off our water. Okay, so then they assume we Ah, I would pick all hydrogen. Assume I have. Ah, them. You have that much hydrogen, You know that? The footy we act so we will. How many oxygen do I need? The militia is 2 to 1 for Ah, hydrogen oxygen. So I just take ah, hydrogen. Um, the more motive I bite you, so I will have Ah, sirrah, point. Seriously. 09 for most of oxygen required to fully re at with. Ah, Hodgins. Okay, So this is the require amounts over here, over here. And there's the actual mouth, so you can see that. Actually, we have a lot off oxygen. So the excess we aging essentially, it's all sitting over here. All right, so let's stab is we're going to Ah, use hydrogen number mostly found that ward number almost again because we have enough also, June. 
But, uh, I know that the food that we have with all the hydrogen Okay, so the limbo move off water, it would be ah, we can find the from the motorway show and also find from the number of most of hydrogen. So the mother wisher is 2 to 2. Essential is 1 to 1. So it's essentially the same as the number most over here for water. So sue syrup on Syria 18 Ah, for out to the moment of water. Okay, so when we know the number more water we have vowed a mass, we just the more plight, Um, the mass off the mill a massive waters with 18 by the limbo Moe's so so far 018 times 18 And that which you have several 180.338 gram for water. Okay, so, uh, we've already filed the mass of water. We know that excess we agent sausage in. So how about the oxygen? We meaning? All right. So we have that much of all sojourn. 0.185 and then we know that you know that hopefully we have imagined they will consume syrup on sale soon. 94 So we just took our agent Noh Ah, concentration off our region. Know them. Don't move for oxygen. Subtract Ah ah ah! Mom required if we were rehab before their hydrogen and that we should be able to find out is because the syrup on cereal night one most we meaning for oxygen remaining this we meaning. And then we can further, um, convert that back to our master. You corresponding to Syria 0.219 Grandma Oxygen. We may you know, we asked your picture.",
"dateCreated": "Aug. 11, 2021, 12:50 a.m.",
"upvoteCount": 3,
"url": "https://www.numerade.com/questions/a-mixture-of-00375-g-of-hydrogen-and-00185-mol-oxygen-in-a-closed-container-is-sparked-to-initiate-a/",
"author": {
"@type": "Person",
"name": "Stephen Ho"
}
},
{
"@type": "Answer",
"text": "The reaction equation in this question is based on the same reaction equation that we had in the previous question. Now, during this reaction, five moles of hydrogen gas reacts with 0.15 moles of oxygen gas in order to produce a certain amount of water. We need to identify the limiting reactant here and also calculate the number of moles of water that can form during this reaction. For this purpose. We will look at two different situations in order to identify the limiting reactant first and that is, um firstly, we will look at the number of moles of water that can be produced If we start off with five moles of hydrogen gas, and secondly, we will look at the number of moles of water that can be produced when starting off with a 0.15 zero moles of oxygen gas. We will then compare these two situations in order to identify the limiting reactant. So, firstly, In order to determine the number of moles of water that can form when starting off with five miles of hydrogen gas, we need to work with the more ratio of water to hydrogen gas. So for this purpose, we will have a look at the stock geometric coefficients here. For water, it is 24 hydrogen gas, it is too, so that more ratio of water um over hydrogen guests is to over two. We can therefore say That the number of moles of water that can form in this case will be one times the number of moles of hydrogen gas, And this is equal to one times five moles, Which is just equal to five moles. Right now, let's look at the second situation where we start off with 0.15 moles of oxygen. Once again, we need to make use of the mole ratio. So in this case it will be to over one. So the number of moles of water with a number of moles of oxygen will be to over one. This means that the number of moles of water that can form in this case is two times the number of moles off oxygen gas. So this is two times uh 1.50 moles, and this is equal to three moles. Right? 
So in the first situation, when we started off with five miles off hydrogen gas, We were able to form five moles of water. But in the case of oxygen, we start off with oxygen Um and specifically 1.50 moles of oxygen. Then we can only end up with 3.00 moles of water, which is the least amount produced in the two situations. So because we can only produce a maximum A number of moles of three moles of water, this indicates that oxygen is the limiting reactant here. Oxygen is the limiting reactant. And if we start off with 0.150 miles of oxygen, Then we can produce three moles of water. Right? So to recap in this reaction, we had to identify the limiting reactant first. For this purpose, we compare the number of moles of water that um could form, starting off with the different number of moles off either. Um First of all, we looked at hydrogen gas and then on the other hand, the number of moles of oxygen gas. So in this way we realized that the limiting reactant is oxygen gas because it can only form three moles of water compared to the five mills that can be formed when we start off with the hydrogen gas, is the reactant. Now, if we start off with oxygen 0.15 moles of oxygen gas, then um three miles of water was formed in the end",
"dateCreated": "Aug. 11, 2021, 12:50 a.m.",
"upvoteCount": 3,
"url": "https://www.numerade.com/questions/if-500-mathrmmol-of-hydrogen-gas-and-150-mathrmmol-of-oxygen-gas-react-what-is-the-limiting-reactant/",
"author": {
"@type": "Person",
"name": "Marietjie Lutz"
}
},
{
"@type": "Answer",
"text": "for this problem, we're gonna be working on understanding limiting reactions and using them to solve for products, were given that we have this chemical equation four, NH three plus 502 yields four N O plus six H 20 Were given that we have 2.35 moles of NH three and 2.75 moles of 02 To work with, we need to understand which of these reactions is the limiting reactant and then use that to figure out how much water we're will be produced. It's important to figure out which of these is the limiting reactant, because this reaction will only go so far as that reaction allows. So, to figure out which one is the limiting reactant, we can choose to use either the NH three or the 02 It doesn't matter. I'm gonna go with the NH three. So I'm going to lay out what I have, I have 2.35 moles of NH three. Next thing I'm gonna do is I'm going to look at our ratio by looking at the coefficients in our balanced equation and see that for every four moles of NH three, we're going to also use five moles of 02 So the way to work this out is I'm going to multiply 2.35 by five and then whatever I get from that, I will then divide by four. And when I do that I get 2.94 malls of 02 because our moles of NH three will cancel out. So then I'm going to go look at how much 02 were given. 2.75 Well, that is that is less than 2.94 So, what this means is that we do not have enough 02 to fully react with the NH three that were given. So that means that are limiting reactant is 02 The next thing we're gonna do is we're going to use that limiting reactant to solve for another ratio like this to find out how much water is going to be produced. So we have 2.75 moles of 02 to work with. We're going to set up our ratio again for every five moles of 02 we can create six moles of H 20 I'm going to multiply 2.75 by six and then divide that answer by five to get 3.30 moles of H 20 and that is how much water we can produce.",
"dateCreated": "Aug. 11, 2021, 12:50 a.m.",
"upvoteCount": 3,
"url": "https://www.numerade.com/questions/in-the-following-reaction-235-mathrmmol-of-mathrmnh_3-reacts-with-275-mathrmmol-of-mathrmo_2-how-man/",
"author": {
"@type": "Person",
"name": "Shaelyn Deal"
}
},
{
"@type": "Answer",
"text": "in this question, Methane gas reacts with oxygen in order to form carbon dioxide and water. So this is a combustion reaction and we start off with one mole of methane gas and five moles of oxygen gas. Now we need to determine the limiting reactant here so that we can determine the number of moles of water that perform in the end. In order to determine that limiting reactant, we will compare the number of moles of water that can be formed Firstly, if we start off with one mole of the fungus and secondly, then if we start off with five miles off oxygen gas. So when we compare these two situations, we will be able to identify the limiting reactive. Now, in order to determine the number of malls Can be formed from one mole of methane gas, we make use of the mole ratio. So we know that geometric coefficient of water six and has a documentary coefficient of two. So the mole ratio of water to six, the kids. Therefore the number of moles of water can be formed in this case 6/2", 3 times the number of Malzahn. Yes. Right. And we know we saw it off with um five starting with one mole of methane gas. And therefore this number of northern waters will be $3.1 which is equal to three months. Right. So let's have a look at the second situation where we choose to start off with the oxygen as our reacted. No, once again, you wanted to To calculate the number of moles of water that can be produced. We need to make use of the mole ratio more racial. In this case of water to oxygen is 6-7. So the number of moles of water Over the number of number of moles of oxygen will be 6, 7. So the number of moles of water that can be 46 or seven times oxygen. So it's 6/7 times. We started off with five levels of oxygen, six of the seven times 5 And that is equal to round off to two decimal places. 4.29. Uh huh. 
So now I need to compare the number of moles that can be formed in these two different situations to firstly, when I started off with one more of methane gas, The reaction um we're able to produce three moles of water. But if I started off with five months of oxygen gas, We actually were able to produce, 4.29 mi of water. So therefore the maximum number of moles that can be produced in this case. Yes, three. And that is from using um it's in gas, which is the limiting reactant. So we started off with a balanced equation and we had to identify the limiting reactant in this case by and we did that by comparing the number of moles of water that could form by starting off in the first place with one more of anything goes. In the second place, five moles of oxygen within. Saw that um The methane gas could not produce more than three miles. Um whereas the oxygen case of the oxygen, Um the reaction produced 4.29 moles. So because we could only produce three moles by using one move methane gas, this is the maximum number of moles that could be produced in this reaction. And therefore we also know that um the limiting reacting to here is a same gas.",
"dateCreated": "Aug. 11, 2021, 12:50 a.m.",
"upvoteCount": 3,
"url": "https://www.numerade.com/questions/if-100-mathrmmol-of-ethane-gas-and-500-mathrmmol-of-oxygen-gas-react-what-is-the-limiting-reactant-a/",
"author": {
"@type": "Person",
"name": "Marietjie Lutz"
}
}
]
}
}
</script>
Basically, I was trying to read broken JSON; as Felix mentioned, JSON cannot contain literal line breaks inside string values.
Solution: use the https://www.npmjs.com/package/jsonrepair module. It detected the bad lines and fixed them; this is likely what Google does (some sort of JSON repair).
PS: I tried https://www.npmjs.com/package/json-fixer without success.
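For what it's worth, the plain str.replace probably broke things because it also rewrites the newlines between tokens, where a backslash followed by n is not valid JSON. Below is a minimal sketch of the narrower idea: escape control characters only while the scanner is inside a string literal. It is written in Python purely for illustration, the function name is made up, and I am not claiming this is how jsonrepair works internally.

```python
import json

# Control characters that must be escaped inside a JSON string literal.
ESCAPES = {"\n": "\\n", "\r": "\\r", "\t": "\\t"}

def escape_newlines_in_strings(raw):
    """Escape raw newlines/tabs, but only inside double-quoted strings."""
    out = []
    in_string = False   # are we currently between two double quotes?
    escaped = False     # was the previous character a backslash?
    for ch in raw:
        if not in_string:
            if ch == '"':
                in_string = True
            out.append(ch)
        elif escaped:
            escaped = False
            out.append(ch)
        elif ch == "\\":
            escaped = True
            out.append(ch)
        elif ch == '"':
            in_string = False
            out.append(ch)
        else:
            out.append(ESCAPES.get(ch, ch))
    return "".join(out)

# Shortened version of the broken snippet from the question.
broken = '{\n"name": "2H2(g) +\nO2(g) --> 2H2O(l)\n"\n}'
data = json.loads(escape_newlines_in_strings(broken))
```

This only deals with stray line breaks. It does not handle other damage, such as an unescaped double quote inside a value, which is where a repair library earns its keep.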
I have a csv file named movie_reviews.csv and the data inside looks like this:
1 Pixar classic is one of the best kids' movies of all time.
1 Apesar de representar um imenso avanço tecnológico, a força
1 It doesn't enhance the experience, because the film's timeless appeal is down to great characters and wonderful storytelling; a classic that doesn't need goggles or gimmicks.
1 As such Toy Story in 3D is never overwhelming. Nor is it tedious, as many recent 3D vehicles have come too close for comfort to.
1 The fresh look serves the story and is never allowed to overwhelm it, leaving a beautifully judged yarn to unwind and enchant a new intake of young cinemagoers.
1 There's no denying 3D adds extra texture to Pixar's seminal 1995 buddy movie, emphasising Buzz and Woody's toy's-eye- view of the world.
1 If anything, it feels even fresher, funnier and more thrilling in today's landscape of over-studied demographically correct moviemaking.
1 If you haven't seen it for a while, you may have forgotten just how fantastic the snappy dialogue, visual gags and genuinely heartfelt story is.
0 The humans are wooden, the computer-animals have that floating, jerky gait of animated fauna.
1 Some thrills, but may be too much for little ones.
1 Like the rest of Johnston's oeuvre, Jumanji puts vivid characters through paces that will quicken any child's pulse.
1 "This smart, scary film, is still a favorite to dust off and take from the ""vhs"" bin"
0 All the effects in the world can't disguise the thin plot.
The first column, with 0s and 1s, is my label.
I want to first turn the texts in movie_reviews.csv into vectors, then split my dataset based on the labels (all 1s to train and 0s to test). Then feed the vectors into a classifier like random forest.
For such a task you'll need to preprocess your data first with different tools. First, lower-case all your sentences. Then delete all stopwords (the, and, or, ...). Tokenize (an introduction here: https://medium.com/@makcedward/nlp-pipeline-word-tokenization-part-1-4b2b547e6a3). You can also use stemming in order to keep only the root of each word; it can be helpful for sentiment classification.
Then you'll assign an index to each word of your vocabulary and replace the words in each sentence with these indexes:
Imagine your vocabulary is: ['i', 'love', 'keras', 'pytorch', 'tensorflow']
index['None'] = 0 #in case a new word is not in your vocabulary
index['i'] = 1
index['love'] = 2
...
Thus the sentence 'I love Keras' will be encoded as [1 2 3].
However, you have to define a maximum length max_len for your sentences, and when a sentence contains fewer words than max_len you pad its vector with zeros up to size max_len.
In the previous example, if max_len = 5 then [1 2 3] -> [1 2 3 0 0].
This is a basic approach. Feel free to check preprocessing tools provided by libraries such as NLTK, Pandas ...
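The steps above can be sketched in plain Python as follows. This is only an illustration under a few assumptions: the stopword list is a tiny made-up one, the tokenizer is a simple regular expression rather than a proper NLP tokenizer, and the function names are hypothetical.

```python
import re

# Tiny illustrative stopword list; a real one (e.g. from NLTK) is much longer.
STOPWORDS = {"the", "and", "or", "a", "an", "is", "of"}

def tokenize(sentence):
    """Lower-case, split into word tokens, and drop stopwords."""
    words = re.findall(r"[a-z0-9']+", sentence.lower())
    return [w for w in words if w not in STOPWORDS]

def build_index(sentences):
    """Assign an integer to every word; 0 is reserved for unknown words."""
    index = {"None": 0}
    for sentence in sentences:
        for word in tokenize(sentence):
            index.setdefault(word, len(index))
    return index

def encode(sentence, index, max_len):
    """Replace words by their indexes, then truncate or zero-pad to max_len."""
    ids = [index.get(word, 0) for word in tokenize(sentence)][:max_len]
    return ids + [0] * (max_len - len(ids))

index = build_index(["I love Keras", "I love PyTorch", "I love TensorFlow"])
vector = encode("I love Keras", index, max_len=5)
```

Here vector holds the padded encoding of 'I love Keras', and every row produced by encode has the same length, so the rows can be stacked into a matrix and passed to a classifier.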
MINOR EDIT: I say below that JPL's Horizons library is not open source. Actually, it is, and it's available here: http://naif.jpl.nasa.gov/naif/tutorials.html
At 2013-01-01 00:00:00 UTC at 0 degrees north latitude, 0 degrees east
longitude, sea level elevation, what is the J2000 epoch right ascension
and declination of the moon?
Sadly, different libraries give slightly different answers. Converted
to degrees, the summarized results (RA first):
Stellarium: 141.9408333000, 9.8899166666 [precision: .0004166640, .0000277777]
Pyephem: 142.1278749990, 9.8274722221 [precision .0000416655, .0000277777]
Libnova: 141.320712606865, 9.76909442356909 [precision unknown]
Horizons: 141.9455833320, 9.8878888888 [precision: .0000416655, .0000277777]
My question: why? Notes:
I realize these differences are small, but:
I use pyephem and libnova to calculate sun/moon rise/set, and
these times can be very sensitive to position at higher latitudes
(eg, midnight sun).
I can understand JPL's Horizons library not being open source,
but the other three are. Shouldn't someone work out the
differences in these libraries and merge them? This is my main
complaint. Do the stellarium/pyephem/libnova library authors have
a fundamental difference in how to make these calculations, or do
they just need to merge their code?
I also realize there might be other reasons the calculations are
different, and would appreciate any help in rectifying these
possible errors:
Pyephem and Libnova may be using the epoch of the date instead of J2000
The moon is close enough that observer location can affect its
RA/DEC (parallax effect).
I'm using Perl's Astro::Nova and Python's pyephem, not the
original C implementations of these libraries. However, if these
differences are caused by using Perl/Python, that is important in
my opinion.
My code (w/ raw results):
First, Perl and Astro::Nova:
#!/bin/perl
# RA/DEC of moon at 0N 0E at 0000 UTC 01 Jan 2013
use Astro::Nova;
# 1356998400 == 01 Jan 2013 0000 UTC
$jd = Astro::Nova::get_julian_from_timet(1356998400);
$coords = Astro::Nova::get_lunar_equ_coords($jd);
print join(",",($coords->get_ra(), $coords->get_dec())),"\n";
RESULT: 141.320712606865,9.76909442356909
- Second, Python and pyephem:
#!/usr/local/bin/python
# RA/DEC of moon at 0N 0E at 0000 UTC 01 Jan 2013
import ephem; e = ephem.Observer(); e.date = '2013/01/01 00:00:00';
moon = ephem.Moon(); moon.compute(e); print(moon.ra, moon.dec)
RESULT: 9:28:30.69 9:49:38.9
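For reference, this is how the sexagesimal output above maps to the degree values in the summary: right ascension is given in hours (1 hour = 15 degrees), declination already in degrees. A minimal sketch in plain Python; the helper name is hypothetical and not part of pyephem.

```python
def sexagesimal_to_degrees(text, hours=False):
    """Convert 'H:M:S' or 'D:M:S' text to decimal degrees."""
    text = text.strip()
    sign = -1.0 if text.startswith("-") else 1.0
    # Take absolute values so the sign is applied once to the whole angle.
    parts = [abs(float(p)) for p in text.split(":")]
    value = sum(p / 60 ** i for i, p in enumerate(parts))
    # Right ascension in hours covers 24 h per 360 degrees, i.e. 15 deg per hour.
    return sign * value * (15.0 if hours else 1.0)

ra_deg = sexagesimal_to_degrees("9:28:30.69", hours=True)   # about 142.127875
dec_deg = sexagesimal_to_degrees("9:49:38.9")                # about 9.827472
```

The last printed digit of the RA is 0.01 seconds of time, which is where the quoted precision of about .0000417 degrees comes from (0.01 / 3600 * 15).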
- The stellarium result (snapshot):
- The JPL Horizons result (snapshot):
[JPL Horizons requires POST data (not really, but pretend), so I
couldn't post a URL].
I haven't linked them (lazy), but I believe there are many
unanswered questions on stackoverflow that effectively reduce to
this question (inconsistency of precision astronomical libraries),
including some of my own questions.
I'm playing w this stuff at: https://github.com/barrycarter/bcapps/tree/master/ASTRO
I have no idea what Stellarium is doing, but I think I know about the other three. You are correct that only Horizons is using J2000 instead of the epoch-of-date for this apparent, locale-specific observation. You can bring it into close agreement with PyEphem by clicking "change" next to the "Table Settings" and switching from "1. Astrometric RA & DEC" to "2. Apparent RA & DEC."
The difference with Libnova is a bit trickier, but my late-night guess is that Libnova uses UT instead of Ephemeris Time, and so to make PyEphem give the same answer you have to convert from one time to the other:
import ephem
moon, e = ephem.Moon(), ephem.Observer()
e.date = '2013/01/01 00:00:00'
e.date -= ephem.delta_t() * ephem.second
moon.compute(e)
print('%.12g %.12g' % (moon.a_ra / ephem.degree, moon.a_dec / ephem.degree))
This outputs:
141.320681918 9.77023197401
Which is, at least, much closer than before. Note that you might also want to do this in your PyEphem code if you want it to ignore refraction like you have asked Horizons to; though for this particular observation I am not seeing it make any difference:
e.pressure = 0
Any residual difference is probably (but not definitely; there could be other sources of error that are not occurring to me right now) due to the different programs using different formulae to predict where the planets will be. PyEphem uses the old but popular VSOP87. Horizons uses the much more recent — and exact — DE405 and DE406, as stated in its output. I do not know what models of the solar system the other products use.
Ideally I could specify something like 10 as my input (in ounces) and get back a string like this: "1 & 1/4 cups". Is there a library that can do something like this? (note: I am totally fine with the rounding implicit in something like this).
Note: I would prefer a C library, but I am OK with solutions for nearly any language as I can probably find appropriate bindings.
It is really two things: 1) the data encompassing the conversion, 2) the presentation of the conversion.
The second is user choice: If you want fractions, you need to write or get a fractions library. There are many.
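As a rough illustration of the presentation side, Python's standard fractions module is one such library. The sketch below assumes US customary fluid ounces (8 per cup) and rounds to the nearest eighth of a cup; the function name and the rounding choice are mine, not something from the question.

```python
from fractions import Fraction

US_FLUID_OUNCES_PER_CUP = 8  # assumption: US customary fluid ounces

def ounces_to_cups_text(ounces, max_denominator=8):
    """Format a volume in fluid ounces as a mixed fraction of cups."""
    cups = Fraction(ounces) / US_FLUID_OUNCES_PER_CUP
    # Round to the closest fraction whose denominator is small enough to read.
    cups = cups.limit_denominator(max_denominator)
    whole, rest = divmod(cups, 1)
    unit = "cup" if cups <= 1 else "cups"
    if rest == 0:
        return f"{whole} {unit}"
    if whole == 0:
        return f"{rest} {unit}"
    return f"{whole} & {rest} {unit}"

label = ounces_to_cups_text(10)
```

With an input of 10 this gives the "1 & 1/4 cups" string from the question. The same approach carries over to C with any small rational-number helper.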
The first is fairly easy. The vast majority of conversions are just a factor. Usually you will organize the known factors as conversions into the appropriate SI unit for each type of quantity (volume, length, area, density, etc.).
Your data then looks something like this:
A  acres                                   4.046870000000000E+03  6
A  ares                                    1.000000000000000E+02  15
A  barns                                   1.000000000000000E-28  15
A  centiares                               1.000000000000000E+00  15
A  darcys                                  9.869230000000000E-13  6
A  doors                                   9.290340000000000E+24  6
A  ferrados                                7.168458781362010E-01  6
A  hectares                                1.000000000000000E+04  15
A  labors                                  7.168625518000000E+05  6
A  Rhode Island                            3.144260000000000E+09  4
A  sections                                2.590000000000000E+06  6
A  sheds                                   1.000000000000000E-48  15
A  square centimeters                      1.000000000000000E-04  15
A  square chains (Gunter's or surveyor's)  4.046860000000000E+02  6
A  square chains (Ramsden's)               9.290304000000000E+02  5
A  square feet                             9.290340000000000E-02  6
A  square inches                           6.451600000000000E-04  15
A  square kilometers                       1.000000000000000E+06  15
A  square links (Gunter's or surveyor's)   4.046900000000000E-02  5
A  square meters (SI)                      1.000000000000000E+00  15
A  square miles (statute)                  2.590000000000000E+06  7
A  square millimeter                       1.000000000000000E-06  15
A  square mils                             6.451610000000000E-10  5
A  square perches                          2.529300000000000E+01  5
A  square poles                            2.529300000000000E+01  5
A  square rods                             2.529300000000000E+01  5
A  square yards                            8.361270000000000E-01  6
A  townships                               9.324009324009320E+07  5
In each case, these are area conversions into the SI unit for area, square meters. To convert between two units, multiply by the source unit's factor to get square meters, then divide by the target unit's factor. The last number on each line is the number of significant digits.
Keep a file of these for the desired factors and then you can convert from any area to any area that you have data on. Repeat for other categories of conversion (volume, power, length, weight, etc.).
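A minimal sketch of that two-step conversion, using a few of the factors from the table above. The dictionary and function names are just for illustration, and a real program would load the factors from the data file instead of hard-coding them.

```python
# Factor that converts one of each unit into square meters (the SI unit).
AREA_TO_SQUARE_METERS = {
    "acres": 4.046870e+03,
    "hectares": 1.0e+04,
    "square feet": 9.290340e-02,
    "square meters (SI)": 1.0,
}

def convert_area(value, from_unit, to_unit):
    """Convert via the SI unit: source -> square meters -> target."""
    square_meters = value * AREA_TO_SQUARE_METERS[from_unit]
    return square_meters / AREA_TO_SQUARE_METERS[to_unit]

hectares = convert_area(1.0, "acres", "hectares")  # about 0.404687
```

Because every unit only needs one factor to the SI unit, adding a new unit is a single new line of data rather than a conversion to every other unit.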
My thoughts were using Google Calculator for this task if you want generic conversions...
Example: http://www.google.com/ig/calculator?q=10%20ounces%20to%20cups -- returns JSON, but I believe you can specify format.
Here's a Java example for currency conversion:
http://blog.caplin.com/2011/01/06/simple-currency-conversion-using-google-calculator-and-java/
Well, for a quick and dirty solution you could always have it run GNU Units as an external program. If your software is GPL compatible you can even rip off the code from Units and use it in your program.
Please check out JSR 363, the Units of Measurement Standard for Java: http://unitsofmeasurement.github.io/
At least in C++ you get basic support via "value types" already, but you still have to implement those conversions yourself or find a suitable library similar to what JSR 363 offers for Java.