debian/0000755000000000000000000000000012302375657007200 5ustar
debian/docs0000644000000000000000000000000711611635070010036 0ustar
docs/*
debian/preinst0000644000000000000000000000030011611635070010566 0ustar
#!/bin/sh
# TODO: remove this file after releasing Squeeze
set -e

if [ "$1" = upgrade ] && dpkg --compare-versions "$2" lt 1.2-3
then
	pycentral pkgremove python-chardet
fi

#DEBHELPER#
debian/copyright0000644000000000000000000000775011611635070011132 0ustar
This package was debianized by Piotr Ożarowski on
Tue, 28 Jun 2006 11:34:00 +0200.

It was downloaded from http://chardet.feedparser.org/

Upstream Author: Mark Pilgrim

Copyright (C) 2006, 2007, 2008 Mark Pilgrim

    This library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
    License as published by the Free Software Foundation; either
    version 2.1 of the License, or (at your option) any later version.

    This library is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
    Lesser General Public License for more details.

    You should have received a copy of the GNU Lesser General Public
    License along with this library; if not, write to the Free Software
    Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
    02110-1301 USA

-----------------------------------------------------------

    The Original Code is mozilla.org code.

    The Initial Developer of the Original Code is
    Netscape Communications Corporation.
    Portions created by the Initial Developer are Copyright (C) 1998, 2005
    the Initial Developer. All Rights Reserved.

    Contributor(s):
      Mark Pilgrim - port to Python

    This library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
    License as published by the Free Software Foundation; either
    version 2.1 of the License, or (at your option) any later version.

    This library is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
    Lesser General Public License for more details.

    You should have received a copy of the GNU Lesser General Public
    License along with this library; if not, write to the Free Software
    Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
    02110-1301 USA

-----------------------------------------------------------

    The Universal Encoding Detector documentation is copyright (C) 2006-2008
    Mark Pilgrim. All rights reserved.

    Redistribution and use in source (XML DocBook) and "compiled" forms (SGML,
    HTML, PDF, PostScript, RTF and so forth) with or without modification, are
    permitted provided that the following conditions are met:

    Redistributions of source code (XML DocBook) must retain the above
    copyright notice, this list of conditions and the following disclaimer
    unmodified.

    Redistributions in compiled form (transformed to other DTDs, converted to
    PDF, PostScript, RTF and other formats) must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in the
    documentation and/or other materials provided with the distribution.

    THIS DOCUMENTATION IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS OR
    IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
    OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN
    NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
    TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
    PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
    LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
    NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
    DOCUMENTATION, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

-----------------------------------------------------------

debian/chardet script:

    Copyright © 2008–2009 Ben Finney

-----------------------------------------------------------

On Debian systems, the complete text of the GNU Lesser General Public
License can be found in the file `/usr/share/common-licenses/LGPL-2.1'.

The Debian packaging is © 2006-2009, Piotr Ożarowski and
is licensed under the LGPL.
debian/chardet0000644000000000000000000001055311611635070010527 0ustar
#! /usr/bin/python
# -*- coding: utf-8 -*-

# bin/chardet
# Part of chardet, the Universal Encoding Detector.
#
# Copyright © 2008–2009 Ben Finney
#
# This is free software; you may copy, modify and/or distribute this
# work under the terms of the GNU Lesser General Public License;
# either version 2.1 or, at your option, any later version.
# No warranty expressed or implied. See the file COPYING for details.

""" %prog [options] [file ...]

Report heuristically-detected character encoding for each file.

For every specified file (defaulting to stdin if no files are
specified), reads and determines the character encoding of the file
content. Reports the name and confidence level for each file's
detected character encoding.
"""

import sys
import optparse
import chardet


class OptionParser(optparse.OptionParser, object):
    """ Command-line parser for this program """

    def __init__(self, *args, **kwargs):
        """ Set up a new instance """
        super(OptionParser, self).__init__(*args, **kwargs)

        global __doc__
        self.usage = __doc__.strip()


def detect_encoding(in_file):
    """ Detect encoding of text in `in_file`

        Parameters
            in_file
                Opened file object to read and examine.

        Return value
            The mapping as returned by `chardet.detect`.

        """
    in_data = in_file.read()
    params = chardet.detect(in_data)
    return params


def report_file_encoding(in_file, encoding_params):
    """ Return a report of the file's encoding

        Parameters
            in_file
                File object being reported. Should have an appropriate
                `name` attribute.

            encoding_params
                Mapping as returned by `detect_encoding` on the file's
                data.

        Return value
            The report is a single line of text showing filename,
            detected encoding, and detection confidence.

        """
    file_name = in_file.name
    encoding_name = encoding_params['encoding']
    confidence = encoding_params['confidence']
    report = (
        "%(file_name)s: %(encoding_name)s"
        " (confidence: %(confidence)0.2f)") % vars()
    return report


def process_file(in_file):
    """ Process a single file

        Parameters
            in_file
                Opened file object to read and examine.

        Return value
            None.

        Reads the file contents, detects the encoding, and writes a
        report line to stdout.

        """
    encoding_params = detect_encoding(in_file)
    encoding_report = report_file_encoding(in_file, encoding_params)
    message = "%(encoding_report)s\n" % vars()
    sys.stdout.write(message)


class DetectEncodingApp(object):
    """ Application behaviour for 'detect-encoding' program """

    def __init__(self, argv):
        """ Set up a new instance """
        self._parse_commandline(argv)

    def _parse_commandline(self, argv):
        """ Parse command-line arguments """
        option_parser = OptionParser()
        (options, args) = option_parser.parse_args(argv[1:])
        self.file_names = args

    def _emit_file_error(self, file_name, error):
        """ Emit an error message regarding file processing """
        error_name = error.__class__.__name__
        message = (
            "%(file_name)s: %(error_name)s: %(error)s\n") % vars()
        sys.stderr.write(message)

    def _process_all_files(self, file_names):
        """ Process all files in list """
        if not len(file_names):
            file_names = [None]
        for file_name in file_names:
            try:
                if file_name is None:
                    file_name = sys.stdin.name
                    in_file = sys.stdin
                else:
                    in_file = open(file_name)
                process_file(in_file)
            # `as` form: accepted by Python 2.6 and later
            except IOError as exc:
                self._emit_file_error(file_name, exc)

    def main(self):
        """ Main entry point for application """
        self._process_all_files(self.file_names)


def __main__(argv=None):
    """ Mainline code for this program """

    from sys import argv as sys_argv

    if argv is None:
        argv = sys_argv

    app = DetectEncodingApp(argv)
    exitcode = None
    try:
        app.main()
    except SystemExit as e:
        exitcode = e.code

    return exitcode


if __name__ == "__main__":
    exitcode = __main__(argv=sys.argv)
    sys.exit(exitcode)
debian/chardet.10000644000000000000000000000157511611635070010672 0ustar
.TH CHARDET "1" "November 2009" "chardet 2.0.1" "User Commands"
.SH NAME
chardet \- universal character encoding detector
.SH SYNOPSIS
.B chardet
[\fIoptions\fR] [\fIfile \fR...]
.SH DESCRIPTION
Report heuristically\-detected character encoding for each file.
.PP
For every specified file (defaulting to stdin if no files are
specified), reads and determines the character encoding of the file
content. Reports the name and confidence level for each file's
detected character encoding.
.SH OPTIONS
.TP
\fB\-h\fR, \fB\-\-help\fR
show this help message and exit
.SH "SEE ALSO"
/usr/share/doc/python-chardet/index.html
.SH AUTHOR
chardet module was written by Mark Pilgrim
.PP
chardet script was written by Ben Finney
.PP
This manual page was written by Piotr Ożarowski ,
for the Debian project (but may be used by others).
debian/changelog0000644000000000000000000000351412302375657011055 0ustar
chardet (2.0.1-2build2) trusty; urgency=medium

  * Rebuild to drop files installed into /usr/share/pyshared.

 -- Matthias Klose  Sun, 23 Feb 2014 13:46:23 +0000

chardet (2.0.1-2build1) precise; urgency=low

  * Rebuild to drop python2.6 dependencies.

 -- Matthias Klose  Sat, 31 Dec 2011 02:01:36 +0000

chardet (2.0.1-2) unstable; urgency=low

  [ Barry Warsaw ]
  * Switch to dh_python2, closes: 634313, LP: #788514

  [ Piotr Ożarowski ]
  * Bump Standards-Version to 3.9.2 (no changes needed)
  * Source format changed to 3.0 (quilt)

 -- Piotr Ożarowski  Wed, 20 Jul 2011 22:28:12 +0200

chardet (2.0.1-1) unstable; urgency=low

  [ Sandro Tosi ]
  * Switch Vcs-Browser field to viewsvn

  [ Piotr Ożarowski ]
  * New upstream release (no changes in the code)
  * Add /usr/bin/chardet (thanks to Ben Finney, closes: #479178)
  * Convert package to dh sequencer and python-support
  * debian/watch file updated (now points to the Python 2.X version)
  * Bump Standards-Version to 3.8.3 (no changes needed)

 -- Piotr Ożarowski  Wed, 11 Nov 2009 14:14:10 +0100

chardet (1.0.1-1.1) unstable; urgency=low

  * NMU. Rebuild to move files to /usr/share/pyshared. Closes: #490452.

 -- Matthias Klose  Fri, 18 Jul 2008 15:58:15 +0000

chardet (1.0.1-1) unstable; urgency=low

  * New upstream release
  * New co-maintainer: Mark Pilgrim
  * Changed my address to piotr@debian.org
  * Added Vcs-Svn, Vcs-Browser and Homepage fields
  * Debian packaging is now licenced under LGPL as well
  * Bump Standards-Version to 3.7.3 (no changes needed)

 -- Piotr Ożarowski  Wed, 05 Mar 2008 20:26:06 +0100

chardet (1.0-1) unstable; urgency=low

  * Initial release (closes: #375809)

 -- Piotr Ozarowski  Sat, 8 Jul 2006 16:12:00 +0200
debian/install0000644000000000000000000000007611611635070010562 0ustar
debian/chardet /usr/bin
debian/chardet.1 /usr/share/man/man1/
debian/compat0000644000000000000000000000000211611635070010364 0ustar
5
debian/control0000644000000000000000000000251611611635466010606 0ustar
Source: chardet
Section: python
Priority: optional
Maintainer: Piotr Ożarowski
Uploaders: Mark Pilgrim , Debian Python Modules Team
Build-Depends: python (>= 2.6.6-3~), debhelper (>= 7)
Standards-Version: 3.9.2
Homepage: http://chardet.feedparser.org/
Vcs-Svn: svn://svn.debian.org/python-modules/packages/chardet/trunk
Vcs-Browser: http://svn.debian.org/viewsvn/python-modules/packages/chardet/trunk/

Package: python-chardet
Architecture: all
Depends: ${python:Depends}, ${misc:Depends}
Description: universal character encoding detector
 Chardet takes a sequence of bytes in an unknown character encoding, and
 attempts to determine the encoding.
 .
 Supported encodings:
  * ASCII, UTF-8, UTF-16 (2 variants), UTF-32 (4 variants)
  * Big5, GB2312, EUC-TW, HZ-GB-2312, ISO-2022-CN (Traditional and Simplified
    Chinese)
  * EUC-JP, SHIFT_JIS, ISO-2022-JP (Japanese)
  * EUC-KR, ISO-2022-KR (Korean)
  * KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, windows-1251 (Cyrillic)
  * ISO-8859-2, windows-1250 (Hungarian)
  * ISO-8859-5, windows-1251 (Bulgarian)
  * windows-1252 (English)
  * ISO-8859-7, windows-1253 (Greek)
  * ISO-8859-8, windows-1255 (Visual and Logical Hebrew)
  * TIS-620 (Thai)
 .
 This library is a port of the auto-detection code in Mozilla.
debian/rules0000755000000000000000000000005411611635103010242 0ustar
#!/usr/bin/make -f
%:
	dh $@ --with python2
debian/source/0000755000000000000000000000000011611635242010467 5ustar
debian/source/format0000644000000000000000000000001411611635503011675 0ustar
3.0 (quilt)
debian/watch0000644000000000000000000000012011611635070010210 0ustar
version=3
http://chardet.feedparser.org/download/ python2-chardet-([\d.]+)\.tgz