tidy::repairString

tidy_repair_string

(PHP 5, PHP 7, PHP 8, PECL tidy >= 0.7.0)

tidy::repairString -- tidy_repair_string -- Repair a string using an optionally provided configuration file

Description

Object-oriented style

public static tidy::repairString(string $string, array|string|null $config = null, ?string $encoding = null): string|false

Procedural style

tidy_repair_string(string $string, array|string|null $config = null, ?string $encoding = null): string|false

Repairs the given string.

Parameters

string

The data to be repaired.

config

The config can be passed either as an array or as a string. If a string is passed, it is interpreted as the name of the configuration file; otherwise, it is interpreted as the options themselves.

See http://api.html-tidy.org/#quick-reference for an explanation of each option.
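
As a rough illustration of the two forms of config, the sketch below passes the same options once as an array and once through a configuration file. The input markup, the chosen options, and the temporary file are assumptions made for the example, not part of this page.

```php
<?php
// Hypothetical input: an unclosed <p> followed by a stray </i>.
$html = '<p>error</i>';

// Form 1: options passed directly as an array.
$options = [
    'show-body-only' => true, // emit only the contents of <body>
    'wrap'           => 0,    // disable line wrapping
];
$fromArray = tidy_repair_string($html, $options);

// Form 2: a string is treated as the name of a configuration file.
// The file is created here only so the example is self-contained.
$configFile = tempnam(sys_get_temp_dir(), 'tidy');
file_put_contents($configFile, "show-body-only: yes\nwrap: 0\n");
$fromFile = tidy_repair_string($html, $configFile);
unlink($configFile);

echo $fromArray, "\n";
echo $fromFile, "\n";
```

A configuration file is convenient when the same options are shared between several scripts; an array keeps the options next to the call that uses them.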

encoding

The encoding parameter sets the encoding for input/output documents. The possible values for encoding are: ascii, latin0, latin1, raw, utf8, iso2022, mac, win1252, ibm858, utf16, utf16le, utf16be, big5, and shiftjis.
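
A minimal sketch of passing encoding, assuming the input is UTF-8 text. The sample string and the show-body-only option are illustrative choices only.

```php
<?php
// Hypothetical UTF-8 input containing non-ASCII characters and a stray </i>.
$html = '<p>Grüße aus Köln</i>';

// The third argument names the encoding used for both input and output.
$clean = tidy_repair_string($html, ['show-body-only' => true], 'utf8');

echo $clean;
```

Note that the names in the list above are written without hyphens (utf8, not utf-8).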

Return Values

Returns the repaired string, or false on failure.
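
Because the function returns string|false, it is usually safer to compare the result strictly against false. The helper below is a hypothetical wrapper written for this illustration; its name and the exception it throws are assumptions.

```php
<?php
/**
 * Hypothetical helper: repair markup, or throw if tidy reports failure.
 */
function repairOrFail(string $markup): string
{
    // Static call; no tidy instance is needed as of PHP 8.0.
    $clean = tidy::repairString($markup);

    // Compare strictly against false rather than relying on truthiness.
    if ($clean === false) {
        throw new RuntimeException('tidy could not repair the input');
    }

    return $clean;
}

echo repairOrFail('<p>error</i>');
```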

Changelog

Version Description
8.0.0 tidy::repairString() is a static method now.
8.0.0 config and encoding are nullable now.
8.0.0 This function no longer accepts the useIncludePath parameter.

Examples

Example #1 tidy::repairString() example

<?php
ob_start();
?>

<html>
<head>
<title>test</title>
</head>
<body>
<p>error</i>
</body>
</html>

<?php

$buffer = ob_get_clean();
$tidy = new tidy();
$clean = $tidy->repairString($buffer);

echo $clean;
?>

The above example will output:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head>
<title>test</title>
</head>
<body>
<p>error</p>
</body>
</html>

User Contributed Notes

gnuffo1 at gmail dot com
14 years ago
You can also use this function to repair XML, for example if stray ampersands etc. are breaking it:

<?php
$xml = tidy_repair_string($xml, array(
    'output-xml' => true,
    'input-xml' => true
));
?>
Romolo
7 years ago
Using tidy, it is very simple to fix a broken ods/odt document.
I wrote the following code to be run from the command line:

<?php
$zip = new ZipArchive();
// ZipArchive::open() returns true on success or an integer error code,
// so the result must be compared strictly against true.
if ($zip->open($argv[1]) === true) {
    $fp = $zip->getStream('content.xml'); // file inside archive
    if (!$fp) {
        die("Error: can't get stream to document file");
    }
    $buf = ""; // file buffer
    ob_start(); // to capture CRC error message
    while (!feof($fp)) {
        $buf .= fread($fp, 2048);
    }
    ob_end_clean();
    fclose($fp);
    $zip->close();
    $config = array(
        'indent' => true,
        'clean' => true,
        'input-xml' => true,
        'output-xml' => true,
        'wrap' => false
    );
    $tidy = new tidy();
    $xml = $tidy->repairString($buf, $config);
    // split() was removed in PHP 7; explode() handles a literal separator
    $array = explode("\n", $xml);
    $file = tempnam("/tmp", "xml");
    $fp = fopen($file, "w+");
    foreach ($array as $key => $value) {
        fwrite($fp, trim($value));
        // keep the first line (the XML declaration) on its own line
        if ($key == 0) {
            fwrite($fp, "\n");
        }
    }
    fclose($fp);
    if ($zip->open($argv[1]) === true) {
        // replace the broken content.xml with the repaired copy
        $zip->deleteName('content.xml');
        $zip->addFile($file, 'content.xml');
        $zip->close();
        echo 'recovery complete';
    } else {
        echo 'recovery failed';
    }
    unlink($file);
}
?>

Save it to a file called fixdoc and invoke it as:
php fixdoc yourbrokendoc

For your safety, please work on a copy of your document.
dan-dot-hunsaker-at-gmail-dot-com
13 years ago
The docs referenced at http://tidy.sourceforge.net/docs/quickref.html above state that the configuration option 'sort-attributes' is an enumeration of 'none' and 'alpha', thereby specifying that strings of either form are the acceptable values. This may not be the case, however - on my system, the option was not honored until I set it to true. This may also be the case with other options, so experiment a bit. The output of tidy::getConfig() may be useful in this regard.
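
One way to follow the suggestion in this note is to set an option and then read it back with tidy::getConfig(). The sketch below is only an illustration: the input markup is made up, and the value that comes back may differ between libtidy versions.

```php
<?php
// Hypothetical input with attributes in non-alphabetical order.
$html = '<p title="t" class="c">text</p>';

// Try the value reported to work in the note above.
$tidy = new tidy();
$tidy->parseString($html, [
    'sort-attributes' => true,
    'show-body-only'  => true,
]);

// getConfig() returns the effective option values as an associative array.
$config = $tidy->getConfig();
var_dump($config['sort-attributes'] ?? null);

// Repair the document and print the result to see whether the option took effect.
$tidy->cleanRepair();
echo tidy_get_output($tidy);
```

Running the same script with a different value for the option and comparing the dumped setting shows how each value was interpreted on a given system.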