Content with Style

Web Technique

XHTML Validation with the W3C validator and PHP

by Pascal Opitz on January 4 2009, 12:25

Amongst other changes, I am working on moving this blog over to application/xhtml+xml as its content type. This calls for much stricter validation before content goes live, because a browser that parses the page as XML shows a parse error instead of the page when the markup is malformed. The W3C validator and Zend_Http_Client make validation in PHP easy.

People who remember my validation shell script from last year (cough) already know that the W3C validator offers a SOAP-like response format. Like other people, I am unhappy that the whole thing is not really a proper SOAP endpoint, but merely the same script returning a SOAP envelope when the POST parameter 'output' is set to 'soap12'.
This is unfortunate: I wasn't able to use Zend_Soap_Client to construct the request, since it wraps the passed parameters in a SOAP envelope as well, which the validator doesn't interpret.
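
To make the point concrete: what the validator reads is an ordinary set of form fields, not a SOAP request. The sketch below only builds such a form-encoded body with PHP's http_build_query() and sends nothing; the '<html />' value is a placeholder for a real document, not something the validator requires.

```php
<?php
// Illustrative values only; '<html />' stands in for the markup to validate.
$params = array(
    'fragment' => '<html />',
    'output'   => 'soap12',
);

// Form-encode the fields the way a urlencoded HTTP POST body carries them.
$body = http_build_query($params, '', '&');

echo $body, "\n"; // fragment=%3Chtml+%2F%3E&output=soap12
```

Only the response comes back in a SOAP envelope; the request stays a plain form post.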

Instead I used Zend_Http_Client to do a POST request, which works neatly but requires processing the SOAP response as an XML document. Below is an example of a validation controller that validates a URL and handles the SOAP response:


<?php
class Admin_ValidationController extends Zend_Controller_Action
{
  public function indexAction() {
    // Fetch the markup of the page that should be validated.
    $url = $this->_request->getParam('url');
    $client = new Zend_Http_Client($url);
    $response = $client->request();
    $fragment = $response->getBody();

    // POST the markup to the validator and ask for the SOAP 1.2 output.
    $client = new Zend_Http_Client('http://validator.w3.org/check');
    $client->setParameterPost('fragment', $fragment);
    $client->setParameterPost('output', 'soap12');
    $validator_response = $client->request('POST');
    $soap_response = $validator_response->getBody();

    // Treat the SOAP envelope as plain XML; @ silences libxml warnings.
    $xml = new DOMDocument();
    @$xml->loadXML($soap_response);
    $xpath = new DOMXPath($xml);
    $xpath->registerNamespace("m", "http://www.w3.org/2005/10/markup-validator");
    $elements = $xpath->query("//m:errorcount");

    $error_str = '';

    // Collect the error messages if the validator reported any.
    if ($elements->item(0) && $elements->item(0)->nodeValue > 0) {
      $errors = $xpath->query("//m:errors/m:errorlist/m:error/m:message");
      foreach ($errors as $node) {
        $error_str .= $node->nodeValue . "\n";
      }
    }

    if (!empty($error_str)) {
      $this->view->message = $error_str;
    } else {
      $this->view->message = 'Validation of ' . $url . ' passed without errors.';
    }
  }
}

No Zend_Http_Client?

For people who cannot or don't want to use Zend Framework at all (or who need a facility to encode POST parameters as multipart form data), it may be worth having a look at the cURL functions in PHP. They provide another easy interface for HTTP and even FTP requests. A possible snippet could look like this:


$params = array(
  'fragment' => '<html />',
  'output' => 'soap12',
);

$url = 'http://validator.w3.org/check';

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $params); // passing an array gives multipart encoding
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
curl_setopt($ch, CURLOPT_REFERER, '');
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 30);

// With CURLOPT_RETURNTRANSFER set, curl_exec() returns the response body, or false on failure.
$response = curl_exec($ch);
if (curl_errno($ch)) {
   print curl_error($ch);
}
curl_close($ch); // release the handle whether or not the request failed

echo $response;

In this example I didn't include the handling of the SOAP response, but you can easily grab that from the previous example.
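
For reference, here is a rough sketch of that response handling as a standalone function, so it can be used with either client. The function name extract_validation_errors() is made up for this illustration, and the sample envelope is hand-written and trimmed to the elements the XPath query touches, so a real validator response will contain more than this.

```php
<?php
// Hypothetical helper (not part of the original controller): returns the
// error messages found in the validator's soap12 response, or null if the
// response cannot be parsed as XML.
function extract_validation_errors($soap_response)
{
    // An empty string cannot be parsed, so bail out before calling loadXML().
    if (!is_string($soap_response) || $soap_response === '') {
        return null;
    }

    $xml = new DOMDocument();
    // @ silences libxml warnings; loadXML() returns false on malformed input.
    if (!@$xml->loadXML($soap_response)) {
        return null;
    }

    $xpath = new DOMXPath($xml);
    $xpath->registerNamespace('m', 'http://www.w3.org/2005/10/markup-validator');

    $messages = array();
    foreach ($xpath->query('//m:errors/m:errorlist/m:error/m:message') as $node) {
        $messages[] = trim($node->nodeValue);
    }
    return $messages;
}

// Hand-written sample, not a captured validator response.
$sample = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
  <env:Body>
    <m:markupvalidationresponse xmlns:m="http://www.w3.org/2005/10/markup-validator">
      <m:errors>
        <m:errorcount>1</m:errorcount>
        <m:errorlist>
          <m:error>
            <m:message>end tag for "p" omitted</m:message>
          </m:error>
        </m:errorlist>
      </m:errors>
    </m:markupvalidationresponse>
  </env:Body>
</env:Envelope>
XML;

print_r(extract_validation_errors($sample));
```

An empty array means the validator reported no errors, while null means the response itself could not be read, which is worth telling apart before putting content live.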

Happy validating everyone!

Comments

  • And before you guys ask: yes, I am still serving as text/html for now, but I did change the DOCTYPE to XHTML 1.0 Strict. In the future I am planning to use content negotiation to serve up the right content type.

    by Pascal Opitz on January 4 2009, 12:53

  • Is there a method to validate multiple websites and show a list, so you can monitor all the websites that you have built? E.g. to keep track in case a customer enters invalid code?

    by Gavin McNamee on June 16 2009, 21:21

  • Hi Guys

    How do I make my W3C validator validate a public website? "Private IPs = yes" only allows me to validate a local site. Any help will be appreciated.

    by spoko on November 11 2009, 14:17

