This module opens a file and performs automatic charset detection based on the HTML5 algorithm. You can then pass the filehandle to HTML::Parser or a related module (or just read it yourself).
WWW: http://search.cpan.org/dist/IO-HTML/