(The webrequest result)
TypeName: Microsoft.PowerShell.Commands.HtmlWebResponseObject
Name MemberType Definition
---- ---------- ----------
Equals Method bool Equals(System.Object obj)
GetHashCode Method int GetHashCode()
GetType Method type GetType()
ToString Method string ToString()
AllElements Property Microsoft.PowerShell.Commands.WebCmdletElementC...
BaseResponse Property System.Net.WebResponse BaseResponse {get;set;}
Content Property string Content {get;}
Forms Property Microsoft.PowerShell.Commands.FormObjectCollect...
Headers Property System.Collections.Generic.Dictionary[string,st...
Images Property Microsoft.PowerShell.Commands.WebCmdletElementC...
InputFields Property Microsoft.PowerShell.Commands.WebCmdletElementC...
Links Property Microsoft.PowerShell.Commands.WebCmdletElementC...
ParsedHtml Property mshtml.IHTMLDocument2 ParsedHtml {get;}
RawContent Property string RawContent {get;}
RawContentLength Property long RawContentLength {get;}
RawContentStream Property System.IO.MemoryStream RawContentStream {get;}
Scripts Property Microsoft.PowerShell.Commands.WebCmdletElementC...
StatusCode Property int StatusCode {get;}
StatusDescription Property string StatusDescription {get;}
(The AllElements property)
TypeName: System.Management.Automation.PSCustomObject
Name MemberType Definition
---- ---------- ----------
Equals Method bool Equals(System.Object obj)
GetHashCode Method int GetHashCode()
GetType Method type GetType()
ToString Method string ToString()
innerHTML NoteProperty innerHTML=null
innerText NoteProperty innerText=null
outerHTML NoteProperty outerHTML=null
outerText NoteProperty outerText=null
tagName NoteProperty System.String tagName=!
Here we have a list of every tag element on the page. An element can be as wide as <html>, with everything nested inside it, or as narrow as a leaf element like <img>.
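As a rough sketch of how that list can be filtered, the snippet below uses a small hand-built array as a stand-in for AllElements, so it runs without a network call. The three sample elements and the variable names are made up for illustration; a real page returns many more entries, each with the same tagName, innerText, innerHTML and related properties shown above.

```powershell
# Hypothetical stand-in for $response.AllElements; a real page returns many more entries.
$elements = @(
    [pscustomobject]@{ tagName = 'HTML';  innerText = 'Solat Time, KL Imsak 5:59' }
    [pscustomobject]@{ tagName = 'LABEL'; innerText = 'Solat Time, KL Imsak 5:59' }
    [pscustomobject]@{ tagName = 'IMG';   innerText = $null }
)

# Keep only the elements of one tag type (-eq is case-insensitive for strings).
$images = @($elements | Where-Object { $_.tagName -eq 'img' })

# Summarise how many elements of each tag the page contains.
$tagCounts = $elements | Group-Object -Property tagName | Select-Object Name, Count
$tagCounts
```

Grouping by tagName first is a quick way to see which tags a page actually uses before deciding what to filter on.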
Going back to parsing well-designed sites, I wanted to write something to check prayer times. In Malaysia there are a few sites that post them; one is www.e-solat.gov.my. In my experience, this site is frequently broken, and since its recent redesign it looks too complicated to parse. www.bankislam.com.my, on the other hand, labels the various parts of its page, so it's easy to pull the data. In the raw HTML, we have this:
<label class="SolatTime">Solat Time, KL <img src="/_layouts/AtQuest/BankIslam/Images/greyarrow3.jpg" /> Imsak 5:59 | Subuh 6:09 | Syuruk 7:28 | Zuhur 1:29 | Asar 4:51 | Maghrib 7:27 | Isyak 8:39</label><br />
We can see here that the data sits in a <label> element with the class "SolatTime". So we can grab that element and split up its text, returning a PSObject of times.
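# Download the page; AllElements holds one object per HTML element.
$biSiteData = Invoke-WebRequest -Uri http://www.bankislam.com.my
# Find the <label> whose HTML mentions "Solat" (-match is case-insensitive).
$htmlData = $biSiteData.AllElements | where { $_.tagName -eq "Label" -and $_.innerHTML -match "SOLAT" }
$result = New-Object psobject
# Split on "|" and drop the first segment, which carries the "Solat Time, KL" heading.
$htmlData.innerText.Split("|") | where { $_ -notmatch "Solat time" } | foreach {
    $entry = $_.Split()   # whitespace split: empty string, name, time
    Add-Member -InputObject $result NoteProperty $entry[1] $entry[2]
}
$result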
Subuh : 6:09
Syuruk : 7:28
Zuhur : 1:29
Asar : 4:51
Maghrib : 7:27
Isyak : 8:39
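Note that Imsak is missing from the result: it shares the first "|" segment with the "Solat Time, KL" heading, so the filter discards it. As an alternative sketch (not what the script above does), a regular expression can pull every name and time pair straight out of the label text. The function name ConvertFrom-SolatLabel and the $sample string are made up for this example, and the pattern assumes each entry is a single word followed by an h:mm time.

```powershell
# Hypothetical helper: turns the label text into an object with one property per prayer time.
function ConvertFrom-SolatLabel {
    param([Parameter(Mandatory)][string]$Text)

    $times = [ordered]@{}
    # Match a word followed by whitespace and an h:mm time, e.g. "Subuh 6:09".
    foreach ($m in [regex]::Matches($Text, '(\p{L}+)\s+(\d{1,2}:\d{2})')) {
        $times[$m.Groups[1].Value] = $m.Groups[2].Value
    }
    [pscustomobject]$times
}

# Sample text copied from the label shown earlier, so no web request is needed.
$sample = 'Solat Time, KL  Imsak 5:59 | Subuh 6:09 | Syuruk 7:28 | Zuhur 1:29 | Asar 4:51 | Maghrib 7:27 | Isyak 8:39'
$parsed = ConvertFrom-SolatLabel -Text $sample
$parsed
```

With a live response, the same function could be fed the innerText of the matched label instead of $sample.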