HTML Parser Using PHP

This is a very simple tutorial on parsing an HTML DOM into a PHP array. If you want to display the static or dynamic content of another website on your own web page (in short, scraping that site's content), you can use this method.
But taking copyrighted content can get you into trouble, so use black hat tricks like this at your own risk. This article is for educational purposes only.

Let’s start the trick.

There is a very useful HTML DOM parser library available on the web called “Simple HTML DOM”, so we are going to use it here to parse HTML content.



Download the HTML parser class file from http://sourceforge.net/projects/simplehtmldom/files/

Then create your project directory, create a file named index.php in it, and include the downloaded library in your page. The sample script expects the library at lib/simple_html_dom.php.
Below is a sample script that parses an HTML DOM.

index.php

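<?php
// Hide PHP warnings and notices in the page output
error_reporting(0);
// Load the Simple HTML DOM parser library
include('lib/simple_html_dom.php');

// Fetch the page and build a DOM object from it
$html = file_get_html('http://lab.iamrohit.in');
if (!$html) {
  // The page could not be downloaded or parsed
  exit('Could not load the page.');
}

$posts = array();
// Every table row becomes one array holding the contents of its cells
foreach ($html->find('tr') as $e)
{
  $data = array();
  foreach ($e->find('td') as $d)
  {
    $data[] = $d->innertext;
  }
  $posts[] = $data;
}

// Print the result in a readable form
echo "<pre>";
print_r($posts);
echo "</pre>";
?>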

The code above parses my page http://lab.iamrohit.in into a PHP array.
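
Rows that contain only header cells (th) come out of the loop above as empty arrays, and innertext keeps any markup found inside a cell. Here is a minimal sketch of a clean-up step. The helper name cleanRows() is hypothetical: it is not part of Simple HTML DOM and uses only built-in PHP functions.

```php
<?php
/**
 * Hypothetical helper (not part of Simple HTML DOM): strips markup and
 * surrounding whitespace from every cell and drops rows with no cells.
 */
function cleanRows(array $rows)
{
    $clean = array();
    foreach ($rows as $row) {
        if (count($row) === 0) {
            continue; // e.g. a header row made only of <th> cells
        }
        $cells = array();
        foreach ($row as $cell) {
            $cells[] = trim(strip_tags($cell));
        }
        $clean[] = $cells;
    }
    return $clean;
}

// Sample data shaped like $posts from the script above (values are made up)
$posts = array(
    array(),
    array('<a href="/demo">Demo</a>', ' <b>PHP</b> '),
);
print_r(cleanRows($posts));
```

If you want to keep the inner markup, drop the strip_tags() call and keep only trim().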

With this library you can do a lot more, such as parsing the dynamic pages of websites, storing another website's data in your local database, or creating an API for a website that doesn't have one, without the owner's involvement, although such an API is of limited value.
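
One way to read the API idea is to return the parsed rows as JSON instead of printing them. The sketch below assumes the rows are already in an array like $posts. The function name rowsToJson() and the column names are hypothetical, chosen only for illustration.

```php
<?php
/**
 * Hypothetical helper: labels each row's cells with the given column
 * names and encodes the result as a JSON string.
 */
function rowsToJson(array $rows, array $columns)
{
    $records = array();
    foreach ($rows as $row) {
        // Skip rows whose cell count does not match the column list
        if (count($row) !== count($columns)) {
            continue;
        }
        $records[] = array_combine($columns, $row);
    }
    return json_encode($records);
}

// Made-up sample rows standing in for scraped data
$posts = array(
    array('Demo one', 'PHP'),
    array('Demo two', 'jQuery'),
    array('incomplete row'),
);
echo rowsToJson($posts, array('title', 'topic'));
```

In a real endpoint you would normally send a Content-Type: application/json header before echoing the string.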

This is a simple HTML parser example. You can read about more features and tags here: http://simplehtmldom.sourceforge.net/

You can see the command-line output below.
[Screenshot: html-parser]

See the output in a browser and download the sample code.

If you like this post, please don’t forget to subscribe to My Public Notebook for more useful stuff.