Let's assume that we have three tables: tbl_student, tbl_lesson and tbl_score(student_id, lesson_id, score).
With ActiveDataProvider, we have:
$query = \app\models\Score::find()
->joinWith('student', false)
->joinWith('lesson', false);
$provider = new \yii\data\ActiveDataProvider([
'query' => $query,
]);
echo \yii\grid\GridView::widget([
'dataProvider' => $provider,
]);
This produces a grid with the columns student_id, lesson_id and score:
<table border="1">
<tr>
<td> student_id </td>
<td> lesson_id </td>
<td> score </td>
</tr>
<tr>
<td> 1 </td>
<td> 1 </td>
<td> 99.5 </td>
</tr>
<tr>
<td> 1 </td>
<td> 2 </td>
<td> 54 </td>
</tr>
<tr>
<td> 1 </td>
<td> 3 </td>
<td> 87 </td>
</tr>
<tr>
<td> 2 </td>
<td> 1 </td>
<td> 76 </td>
</tr>
<tr>
<td> 2 </td>
<td> 2 </td>
<td> 84 </td>
</tr>
<tr>
<td> 2 </td>
<td> 3 </td>
<td> 69 </td>
</tr>
</table>
But what I want is to display students in the first COLUMN, and lessons in the first ROW and then display the associated scores in the body of the table:
<table border="1">
<tr>
<td>student_id</td>
<td>lesson_1</td>
<td>lesson_2</td>
<td>lesson_3</td>
</tr>
<tr>
<td>1</td>
<td>99.5</td>
<td>54</td>
<td>87</td>
</tr>
<tr>
<td>2</td>
<td>76</td>
<td>84</td>
<td>69</td>
</tr>
</table>
How can I do that?
Thanks in advance.
I know that I can use ArrayDataProvider, but the Yii2 data providers guide says:
Note: Compared to Active Data Provider and SQL Data Provider, array data provider is less efficient because it requires loading all data into the memory.
So I want to do this using ActiveDataProvider.
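One way to get the students-by-lessons layout while keeping the work in the database is conditional aggregation: group by student_id and emit one MAX(CASE ...) column per lesson. The sketch below only illustrates that query shape, using Python's sqlite3 module and an in-memory copy of the sample scores. The lesson_N column aliases are an assumption, and whether the resulting query fits ActiveDataProvider as-is is not something this thread confirms.

```python
import sqlite3

# In-memory stand-in for tbl_score, filled with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_score (student_id INTEGER, lesson_id INTEGER, score REAL)"
)
conn.executemany(
    "INSERT INTO tbl_score VALUES (?, ?, ?)",
    [(1, 1, 99.5), (1, 2, 54), (1, 3, 87), (2, 1, 76), (2, 2, 84), (2, 3, 69)],
)

# Discover the lessons first, because each one becomes its own column.
lesson_ids = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT lesson_id FROM tbl_score ORDER BY lesson_id"
    )
]

# One conditional aggregate per lesson; int() keeps the interpolated value numeric.
columns = ", ".join(
    f"MAX(CASE WHEN lesson_id = {int(lesson)} THEN score END) AS lesson_{int(lesson)}"
    for lesson in lesson_ids
)
pivot_sql = (
    f"SELECT student_id, {columns} "
    "FROM tbl_score GROUP BY student_id ORDER BY student_id"
)

rows = conn.execute(pivot_sql).fetchall()
for row in rows:
    print(row)
```

The number of columns depends on the data, so the query text has to be rebuilt whenever lessons are added or removed.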
I am trying to extract some information from a table on a webpage, but the table is unstructured: header cells and content cells sit side by side in the same rows, like this (my apologies for not disclosing the webpage):
<table class="table-detail">
<tbody>
<tr>
<td colspan="4" class="noborder">General Information
</td>
</tr>
<tr>
<th>Full name</th>
<td>
James Smith
</td>
<th>Year of birth</th>
<td>1992</td>
</tr>
<tr>
<th>Gender</th>
<td>Male</td>
</tr>
<tr>
<th>Place of birth</th>
<td>TTexas, USA</td>
<td> </td>
<td> </td>
</tr>
<tr>
<th>Address</th>
<td>Texas, USA</td>
<td> </td>
<td></td>
</tr>
At the moment, I am able to extract the table by using this script:
import pandas as pd
import requests
url = "https://example.com"  # placeholder; requests needs a URL with a scheme
r = requests.get(url)
df_list = pd.read_html(r.text)
df = df_list[0]
df.head()
df.to_csv('myfile.csv',encoding='utf-8-sig')
The extracted DataFrame keeps that layout, with headers and values mixed in the same rows.
However, I am stuck on how to restructure this in Python. The result I want is a table with one column per header (Full name, Year of birth, and so on) and the values in a single row.
Any help would be appreciated. Thank you so much in advance.
You can use BeautifulSoup to parse the HTML. For example:
import pandas as pd
from bs4 import BeautifulSoup
txt = '''<table class="table-detail">
<tbody>
<tr>
<td colspan="4" class="noborder">General Information
</td>
</tr>
<tr>
<th>Full name</th>
<td>
James Smith
</td>
<th>Year of birth</th>
<td>1992</td>
</tr>
<tr>
<th>Gender</th>
<td>Male</td>
</tr>
<tr>
<th>Place of birth</th>
<td>TTexas, USA</td>
<td> </td>
<td> </td>
</tr>
<tr>
<th>Address</th>
<td>Texas, USA</td>
<td> </td>
<td></td>
</tr>'''
soup = BeautifulSoup(txt, 'html.parser')
row = {}
for h in soup.select('th:has(+td)'):
row[h.text] = h.find_next('td').get_text(strip=True)
df = pd.DataFrame([row])
print(df)
Prints:
Full name Year of birth Gender Place of birth Address
0 James Smith 1992 Male TTexas, USA Texas, USA
I want to use a for loop over a list in Groovy and print the index i in a table. How can I do this with Groovy?
<tr>
<td>Options</td>
<td>
#{list i=0 ,items: optionitem, as:'optionitem' , i++}
<td> i </td>
<td>${optionitem}</td>
<td>
#{/list}
</tr>
Assuming you're using Grails...
<tr>
<td>Options</td>
<td>
<g:each in="${options}" var="optionitem" status="i" >
<td>${i}</td>
<td>${optionitem}</td>
</g:each>
</td>
</tr>
My table is named "ad_publisher_details". I want to find the publisher company that appears most often, that is, the company name with the largest number of rows. Can anyone help?
Table:
<table>
<thead>
<th>
ad_id
</th>
<th>
publisher_name
</th>
<th>
publisher_company
</th>
</thead>
<tbody>
<tr>
<td>
1
</td>
<td>
bikroy manager
</td>
<td>
bikroy.com
</td>
</tr>
<tr>
<td>
2
</td>
<td>
olx manager
</td>
<td>
olx.com
</td>
</tr>
<tr>
<td>
3
</td>
<td>
microsoft manager
</td>
<td>
microsoft bangladesh
</td>
</tr>
<tr>
<td>
4
</td>
<td>
microsoft manager
</td>
<td>
microsoft bangladesh
</td>
</tr>
<tr>
<td>
5
</td>
<td>
marketing manager
</td>
<td>
land rover
</td>
</tr>
</tbody>
</table>
The SQL query should return only the publisher_company value "Microsoft Bangladesh", as it has the most rows.
You should:
GROUP BY publisher_company.
ORDER BY the number of rows, in descending order.
LIMIT the results to a single row.
In other words you can do:
SELECT publisher_company
FROM ad_publisher_details
GROUP BY publisher_company
ORDER BY COUNT(*) DESC
LIMIT 1;
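To see the query at work, here is a minimal sketch that runs it against an in-memory SQLite copy of the sample table through Python's sqlite3 module. SQLite stands in for MySQL here only so the example is self-contained; the query text is the same as above.

```python
import sqlite3

# In-memory stand-in for the MySQL table, loaded with the rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ad_publisher_details ("
    "ad_id INTEGER PRIMARY KEY, publisher_name TEXT, publisher_company TEXT)"
)
conn.executemany(
    "INSERT INTO ad_publisher_details VALUES (?, ?, ?)",
    [
        (1, "bikroy manager", "bikroy.com"),
        (2, "olx manager", "olx.com"),
        (3, "microsoft manager", "microsoft bangladesh"),
        (4, "microsoft manager", "microsoft bangladesh"),
        (5, "marketing manager", "land rover"),
    ],
)

# Group rows per company, sort groups by size, keep the largest one.
row = conn.execute(
    """
    SELECT publisher_company
    FROM ad_publisher_details
    GROUP BY publisher_company
    ORDER BY COUNT(*) DESC
    LIMIT 1
    """
).fetchone()

top_company = row[0]
print(top_company)  # microsoft bangladesh
```

If two companies tie for the highest count, LIMIT 1 returns just one of them, and which one is not defined by this query.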
At last I found a solution. It took a lot more PHP code. Here is my solution.
<?php
$i=0;
$j=0;
$br_temp=0;
$query1="SELECT publisher_company FROM ad_publisher_details WHERE 1";
$result1=mysql_query($query1,$dbs);
while($row1=mysql_fetch_array($result1))
{
$publisher_company[$i]=$row1['publisher_company'];
$i++;
}
for($i=0;$i<count($publisher_company);$i++)
{
$temp2= $publisher_company[$i];
$query2="SELECT COUNT(ad_id) AS count_ad FROM ad_publisher_details WHERE publisher_company='$temp2'";
$result2=mysql_query($query2,$dbs);
while($row2=mysql_fetch_array($result2))
{
$most_ad[$j]=$row2['count_ad'];
$j++;
}
}
$highest_ad = max($most_ad);
for($i=0;$i<count($publisher_company);$i++)
{
$i_temp = $publisher_company[$i];
$query3="SELECT COUNT(ad_id) AS match_ad FROM ad_publisher_details WHERE publisher_company='$i_temp'";
$result3=mysql_query($query3,$dbs);
while($row3=mysql_fetch_array($result3))
{
if($highest_ad==$row3['match_ad'])
{
echo $i_temp;
$br_temp=1;
break;
}
if($br_temp==1)
{
break;
}
}
if($br_temp==1)
{
break;
}
}
?>
I need to change quite a few HTML entries in a MySQL database. My problem is that some tags need to be replaced while the surrounding code needs to stay the same. In detail: all td tags in tr tags with the class "kopf" need to be changed to th tags (along with the corresponding closing tags).
It would not be a problem without the closing tags.
update `tt_content` set `bodytext` = replace(`bodytext`,'<tr class="kopf"><td colspan="2">','<tr><th colspan="2">');
This would work.
From what I found, the % sign is used, but how exactly?
update `tt_content` set `bodytext` = replace(`bodytext`,'<tr class="kopf"><td colspan="2">%</td></tr>','<tr><th colspan="2">%</th></tr>');
I guess this would replace all the code within the old td tags with a % sign? How can I achieve the needed replacement?
Edit: just to clarify things, here is a possible entry in the DB:
<table class="techDat" > <tbody> <tr class="kopf"> <td colspan="2"> <p><strong>Technical data:</strong></p> </td> </tr> <tr> <td> <p>Operating time depending on battery chargeBetriebszeit je Akkuladung</p> </td> <td> <p>Approx. 4 h</p> </td> </tr> <tr> <td> <p>Maximum volume</p> </td> <td> <p>Approx. 120 dB(A)</p> </td> </tr> <tr> <td> <p>Weight</p> </td> <td> <p>Approx. 59 g</p> </td> </tr> </tbody> </table>
After the MySQL replacement it should look like this:
<table class="techDat" > <tbody> <tr> <th colspan="2"> <p><strong>Technical data:</strong></p> </th> </tr> <tr> <td> <p>Operating time depending on battery chargeBetriebszeit je Akkuladung</p> </td> <td> <p>Approx. 4 h</p> </td> </tr> <tr> <td> <p>Maximum volume</p> </td> <td> <p>Approx. 120 dB(A)</p> </td> </tr> <tr> <td> <p>Weight</p> </td> <td> <p>Approx. 59 g</p> </td> </tr> </tbody> </table>
Try two nested replaces:
update `tt_content` set `bodytext` =
replace(replace(`bodytext`,
'<tr class="kopf"><td colspan="2">','<tr><th colspan="2">'),
'</td></tr>','</th></tr>')
Try updating your records with two queries:
1) Without the % sign:
update `tt_content` set `bodytext` = replace(`bodytext`,'<tr class="kopf"><td colspan="2">','<tr><th colspan="2">');
2) With the % sign:
update `tt_content` set `bodytext` = replace(`bodytext`,'<tr class="kopf"><td colspan="2">%</td></tr>','<tr><th colspan="2">%</th></tr>')
where instr(`bodytext`,'%') > 0 ;
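MySQL's REPLACE() matches literal strings, so a % inside its arguments is an ordinary character rather than a wildcard (it is a wildcard only in LIKE patterns). One alternative, swapped in here purely as an illustration, is to rewrite each bodytext value outside SQL. The sketch below does that with a Python regular expression; the function name promote_kopf_cells is hypothetical, and it assumes the attribute is written exactly as class="kopf" and that tables are not nested. Reading from and writing back to tt_content is left out.

```python
import re

# One header row: from <tr class="kopf"> up to the first </tr> (non-greedy).
KOPF_ROW = re.compile(r'<tr class="kopf">(.*?)</tr>', re.DOTALL)


def promote_kopf_cells(html):
    """Turn td cells into th cells, but only inside rows marked class="kopf"."""

    def rewrite(match):
        inner = match.group(1)
        inner = re.sub(r"<td\b", "<th", inner)  # opening tags, attributes kept
        inner = inner.replace("</td>", "</th>")  # closing tags
        return "<tr>" + inner + "</tr>"  # the class attribute is dropped

    return KOPF_ROW.sub(rewrite, html)


# Shortened version of the sample entry from the question.
before = (
    '<table class="techDat" > <tbody> <tr class="kopf"> <td colspan="2"> '
    "<p><strong>Technical data:</strong></p> </td> </tr> "
    "<tr> <td> <p>Weight</p> </td> <td> <p>Approx. 59 g</p> </td> </tr> "
    "</tbody> </table>"
)
after = promote_kopf_cells(before)
print(after)
```

Regular expressions are fragile on arbitrary HTML, so an HTML parser would be the safer choice if the stored markup varies more than the sample suggests.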
Using the following code I am trying to scrape a call log from our phone provider's web application to enter the info into my Ruby on Rails application.
desc "Import incoming calls"
task :fetch_incomingcalls => :environment do
# Logs into manage.phoneprovider.co.uk and retrieves the list of incoming calls.
require 'rubygems'
require 'mechanize'
require 'logger'
# Create a new mechanize object
agent = Mechanize.new { |a| a.log = Logger.new(STDERR) }
# Load the Phone Provider website
page = agent.get("https://manage.phoneprovider.co.uk/login")
# Select the first form
form = agent.page.forms.first
form.username = 'username'
form.password = 'password'
# Submit the form
page = form.submit form.buttons.first
# Click on link called Call Logs
page = agent.page.link_with(:text => "Call Logs").click
# Click on link called Incoming Calls
page = agent.page.link_with(:text => "Incoming Calls").click
# Prints out table rows
# puts doc.css('table > tr')
# Print out the body as a test
# puts page.body
end
As you can see from the commented-out lines at the end, I have tested that 'puts page.body' works and that the above code runs successfully. It logs in and then navigates to Call Logs followed by Incoming Calls. The incoming call table looks like this:
| Timestamp | Source | Destination | Duration |
| 03 Jan 13:40 | 12345678 | 12345679 | 00:01:01 |
| 03 Jan 13:40 | 12345678 | 12345679 | 00:01:01 |
| 03 Jan 13:40 | 12345678 | 12345679 | 00:01:01 |
| 03 Jan 13:40 | 12345678 | 12345679 | 00:01:01 |
Which is generated from the following code:
<thead>
<tr>
<td>Timestamp</td>
<td>Source</td>
<td>Destination</td>
<td>Duration</td>
<td>Cost</td>
<td class='centre'>Recording</td>
</tr>
</thead>
<tbody>
<tr class='o'>
<tr>
<td>03 Jan 13:40</td>
<td>12345678</td>
<td>12345679</td>
<td>00:01:14</td>
<td></td>
<td class='opt recording'>
</td>
</tr>
</tr>
<tr class='e'>
<tr>
<td>30 Dec 20:31</td>
<td>12345678</td>
<td>12345679</td>
<td>00:02:52</td>
<td></td>
<td class='opt recording'>
</td>
</tr>
</tr>
<tr class='o'>
<tr>
<td>24 Dec 00:03</td>
<td>12345678</td>
<td>12345679</td>
<td>00:00:09</td>
<td></td>
<td class='opt recording'>
</td>
</tr>
</tr>
<tr class='e'>
<tr>
<td>23 Dec 14:56</td>
<td>12345678</td>
<td>12345679</td>
<td>00:00:07</td>
<td></td>
<td class='opt recording'>
</td>
</tr>
</tr>
<tr class='o'>
<tr>
<td>21 Dec 13:26</td>
<td>07793770851</td>
<td>12345679</td>
<td>00:00:26</td>
<td></td>
<td class='opt recording'>
</td>
</tr>
</tr>
I'm trying to work out how to select just the cells I want (Timestamp, Source, Destination and Duration) and output those. I can then worry about writing them to the database rather than to the terminal.
I have tried using Selector Gadget, but it just shows either 'td' or 'tr:nth-child(6) td , tr:nth-child(2) td' if I select multiple cells.
Any help or pointers would be appreciated!
There is a pattern in the table that is easy to leverage using XPath. The <tr> tags of rows with the required information lack the class attribute. Fortunately, XPath provides some simple logical operations, including not(). This provides just the functionality we need.
Once we've reduced the number of rows we're dealing with, we can iterate over the rows and extract the text of the necessary columns by using XPath's element[n] selector. One important note here is that XPath counts elements starting from 1, so the first column of a table row would be td[1].
Example code using Nokogiri (and specs):
require "rspec"
require "nokogiri"
HTML = <<HTML
<table>
<thead>
<tr>
<td>
Timestamp
</td>
<td>
Source
</td>
<td>
Destination
</td>
<td>
Duration
</td>
<td>
Cost
</td>
<td class='centre'>
Recording
</td>
</tr>
</thead>
<tbody>
<tr class='o'>
<td></td>
</tr>
<tr>
<td>
03 Jan 13:40
</td>
<td>
12345678
</td>
<td>
12345679
</td>
<td>
00:01:14
</td>
<td></td>
<td class='opt recording'></td>
</tr>
<tr class='e'>
<td></td>
</tr>
<tr>
<td>
30 Dec 20:31
</td>
<td>
12345678
</td>
<td>
12345679
</td>
<td>
00:02:52
</td>
<td></td>
<td class='opt recording'></td>
</tr>
<tr class='o'>
<td></td>
</tr>
<tr>
<td>
24 Dec 00:03
</td>
<td>
12345678
</td>
<td>
12345679
</td>
<td>
00:00:09
</td>
<td></td>
<td class='opt recording'></td>
</tr>
<tr class='e'>
<td></td>
</tr>
<tr>
<td>
23 Dec 14:56
</td>
<td>
12345678
</td>
<td>
12345679
</td>
<td>
00:00:07
</td>
<td></td>
<td class='opt recording'></td>
</tr>
<tr class='o'>
<td></td>
</tr>
<tr>
<td>
21 Dec 13:26
</td>
<td>
07793770851
</td>
<td>
12345679
</td>
<td>
00:00:26
</td>
<td></td>
<td class='opt recording'></td>
</tr>
</tbody>
</table>
HTML
class TableExtractor
def extract_data html
Nokogiri::HTML(html).xpath("//table/tbody/tr[not(@class)]").collect do |row|
timestamp = row.at("td[1]").text.strip
source = row.at("td[2]").text.strip
destination = row.at("td[3]").text.strip
duration = row.at("td[4]").text.strip
{:timestamp => timestamp, :source => source, :destination => destination, :duration => duration}
end
end
end
describe TableExtractor do
before(:all) do
@html = HTML
end
it "should extract the timestamp properly" do
subject.extract_data(@html)[0][:timestamp].should eq "03 Jan 13:40"
end
it "should extract the source properly" do
subject.extract_data(@html)[0][:source].should eq "12345678"
end
it "should extract the destination properly" do
subject.extract_data(@html)[0][:destination].should eq "12345679"
end
it "should extract the duration properly" do
subject.extract_data(@html)[0][:duration].should eq "00:01:14"
end
it "should extract all informational rows" do
subject.extract_data(@html).count.should eq 5
end
end
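The same two steps (keep only the rows without a class attribute, then read cells by their 1-based position) can be sketched in Python with the standard library's xml.etree.ElementTree. This is an illustration under assumptions, not a port of the Nokogiri code: ElementTree only parses well-formed markup and its XPath subset has no not(), so the class check is done in Python. SAMPLE and extract_calls are hypothetical names, and the sample is a shortened, well-formed stand-in for the call-log table.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample shaped like the call-log table above.
SAMPLE = """
<table>
  <tbody>
    <tr class="o"><td></td></tr>
    <tr>
      <td>03 Jan 13:40</td><td>12345678</td><td>12345679</td>
      <td>00:01:14</td><td></td><td class="opt recording"></td>
    </tr>
    <tr class="e"><td></td></tr>
    <tr>
      <td>30 Dec 20:31</td><td>12345678</td><td>12345679</td>
      <td>00:02:52</td><td></td><td class="opt recording"></td>
    </tr>
  </tbody>
</table>
"""

FIELDS = ("timestamp", "source", "destination", "duration")


def extract_calls(markup):
    """Return one dict per data row, i.e. per <tr> without a class attribute."""
    root = ET.fromstring(markup.strip())
    calls = []
    for row in root.findall("./tbody/tr"):
        if row.get("class") is not None:
            continue  # striping rows carry class='o' or class='e'; skip them
        # Positions are 1-based, so td[1] is the first cell of the row.
        calls.append(
            {
                name: (row.find(f"td[{index}]").text or "").strip()
                for index, name in enumerate(FIELDS, start=1)
            }
        )
    return calls


calls = extract_calls(SAMPLE)
for call in calls:
    print(call)
```

Real pages are rarely well-formed XML, so this only works as-is on cleaned-up markup like the sample.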
Your answer lies in this Railscasts episode:
http://railscasts.com/episodes/190-screen-scraping-with-nokogiri
This may help too:
How do I parse an HTML table with Nokogiri?
You should be able to reach the exact node you require from the root (in the worst case) using XPath selectors. Using XPath with Nokogiri is listed here.
For details on how to reach all your elements using XPath, look here.