DOMDocument::saveHTMLFile

(PHP 5, PHP 7)

DOMDocument::saveHTMLFile Dumps the internal document into a file using HTML formatting

说明

public int DOMDocument::saveHTMLFile ( string $filename )

Creates an HTML document from the DOM representation. This function is usually called after building a new dom document from scratch as in the example below.

参数

filename

The path to the saved HTML document.

返回值

Returns the number of bytes written or FALSE if an error occurred.

范例

Example #1 Saving a HTML tree into a file

<?php

$doc 
= new DOMDocument('1.0');
// we want a nice output
$doc->formatOutput true;

$root $doc->createElement('html');
$root $doc->appendChild($root);

$head $doc->createElement('head');
$head $root->appendChild($head);

$title $doc->createElement('title');
$title $head->appendChild($title);

$text $doc->createTextNode('This is the title');
$text $title->appendChild($text);

echo 
'Wrote: ' $doc->saveHTMLFile("/tmp/test.html") . ' bytes'// Wrote: 129 bytes

?>

参见

User Contributed Notes

RiKdnUA at mail dot ru 28-Jul-2013 01:15
saveHTMLFile() always saves the file in UTF-8. Even if the DOMDocument->encoding explicitly prescribe different from UTF-8 encoding. All "non-Latin" characters will be converted to HTML-entities. Tested in PHP 5.2.9-2 and PHP 5.2.17. Example:

<?php
$document
=new domDocument('1.0', 'WINDOWS-1251');
$document->loadHTML('<html><head><title>Russian language</title></head><body>Русский язык</body></html>');
$document->formatOutput=true;
$document->encoding='WINDOWS-1251';
echo
"Записано байт. Recorded bytes: ".$document->saveHTMLFile('html.html');
?>

Method recorded file in UTF-8 encoding. The contents of the file html.html:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; CHARSET=gb2312">
<title>Russian language</title>
</head>
<body>Dó??êèé ???ê</body>
</html>
naebeth at hotmail dot NOSPAM dot com 24-Sep-2012 07:03
Not mentioned in the documentation is the fact that using DOMDocument::saveHTMLFile() will automatically overwrite the contents if an existing file is used - with no notice, warning or error thrown.

Make sure you check the filename before using this function so that you don't accidentally overwrite important files.

Example:
<?php

$file
= fopen('test.html', 'w');
fwrite($file, 'this is some text');
fclose($file);

$doc = new DOMDocument();
$doc->formatOutput = true;
$doc->loadHTML('<html><head><title>Test</title></head><body></body></html>');
$doc->saveHTMLFile('test.html');

// test.html
/*
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; CHARSET=gb2312">
<title>Test</title>
</head>
<body></body>
</html>
*/

?>

If you're dynamically generating a series of pages using DOMDocument objects, make sure you are also dynamically generating the file or directory names using something that can't easily be confused for an existing file/folder, or check if the desired path already exists before saving so that you don't accidentally delete previous files.
deep42thouSPAMght42 at y_a_h_o_o dot com 14-Jan-2011 12:46
I foolishly assumed that this function was equivalent to
<?php
file_put_contents
($filename, $document->saveHTML());
?>
but there are differences in the generated HTML:
<?php
$doc
= new DOMDocument();
$doc->loadHTML(
   
'<html><head><title>Test</title></head><body></body></html>'
);
$doc->encoding = 'iso-8859-1';

echo
$doc->saveHTML();
#<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
#<html>
#<head><title>Test</title></head>
#<body></body>
#</html>

$doc->saveHTMLFile('output.html');
#<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
#<html><head><meta http-equiv="Content-Type" content="text/html; CHARSET=gb2312"><title>Test</title></head><body></body></html>

?>
Note that saveHTMLFile() adds a UTF-8 meta tag despite the ISO-8859-1 document encoding.