wcwidth-0.2.5/

wcwidth-0.2.5/README.rst -> docs/intro.rst

wcwidth-0.2.5/.python-version

3.8.2
3.7.6
3.6.9
3.5.9
2.7.17

wcwidth-0.2.5/docs/

wcwidth-0.2.5/docs/intro.rst

|pypi_downloads| |codecov| |license|

============
Introduction
============

This library is mainly for CLI programs that carefully produce output for
Terminals, or pretend to be a terminal emulator.

**Problem Statement**: The printable length of *most* strings is equal to the
number of cells they occupy on the screen, ``1 character : 1 cell``.  However,
there are categories of characters that *occupy 2 cells* (full-wide), and
others that *occupy 0* cells (zero-width).

**Solution**: POSIX.1-2001 and POSIX.1-2008 conforming systems provide
`wcwidth(3)`_ and `wcswidth(3)`_ C functions, whose behavior this Python
module's functions precisely copy.  *These functions return the number of
cells a unicode string is expected to occupy.*

Installation
------------

The stable version of this package is maintained on pypi, install using pip::

    pip install wcwidth

Example
-------

**Problem**: given the following phrase (Japanese),

   >>> text = u'コンニチハ'

Python **incorrectly** uses the *string length* of 5 codepoints rather than the
*printable length* of 10 cells, so that when using the ``rjust`` function, the
output length is wrong::

    >>> print(len('コンニチハ'))
    5

    >>> print('コンニチハ'.rjust(20, '_'))
    _______________コンニチハ

By defining our own "rjust" function that uses wcwidth, we can correct this::

   >>> def wc_rjust(text, length, padding=' '):
   ...    from wcwidth import wcswidth
   ...    return padding * max(0, (length - wcswidth(text))) + text
   ...
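Because ``wcswidth()`` returns ``-1`` for a string containing a non-printable
character, the ``wc_rjust`` function above would pad such a string with
``length + 1`` characters. A minimal sketch of a more defensive variant
follows. The name ``wc_rjust_safe``, the choice to return such text unpadded,
and the ``unicodedata``-based stand-in used only when wcwidth is not installed
are assumptions made for illustration; none of them is part of this library.

```python
import unicodedata

try:
    from wcwidth import wcswidth
except ImportError:
    # Hypothetical stand-in so the sketch runs without wcwidth installed.
    # It only approximates the library: it does not use wcwidth's tables.
    def wcswidth(text):
        width = 0
        for char in text:
            if char == u'\0':
                continue                      # NULL occupies no cells
            category = unicodedata.category(char)
            if category == 'Cc':
                return -1                     # control character: indeterminate
            if category in ('Mn', 'Me'):
                continue                      # combining marks occupy no cells
            wide = unicodedata.east_asian_width(char) in ('W', 'F')
            width += 2 if wide else 1
        return width


def wc_rjust_safe(text, length, padding=' '):
    """Right-justify by cell width; leave unmeasurable text unpadded."""
    width = wcswidth(text)
    if width < 0:
        # Assumption: when the width cannot be determined, do not guess.
        return text
    return padding * max(0, length - width) + text
```

A caller that prefers to fail loudly could raise ``ValueError`` in the
``width < 0`` branch instead of returning the text unchanged.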
Our **Solution** uses wcswidth to determine the string length correctly::

   >>> from wcwidth import wcswidth
   >>> print(wcswidth('コンニチハ'))
   10

   >>> print(wc_rjust('コンニチハ', 20, '_'))
   __________コンニチハ


Choosing a Version
------------------

Export an environment variable, ``UNICODE_VERSION``. This should be done by
*terminal emulators* or those developers experimenting with authoring one of
their own, from shell::

   $ export UNICODE_VERSION=13.0

If unspecified, the latest version is used. If your Terminal Emulator does not
export this variable, you can use the `jquast/ucs-detect`_ utility to
automatically detect and export it to your shell.

wcwidth, wcswidth
-----------------

Use function ``wcwidth()`` to determine the length of a *single unicode
character*, and ``wcswidth()`` to determine the length of many, a *string
of unicode characters*.

Briefly, return values of function ``wcwidth()`` are:

``-1``
  Indeterminate (not printable).

``0``
  Does not advance the cursor, such as NULL or Combining.

``2``
  Characters of category East Asian Wide (W) or East Asian
  Full-width (F) which are displayed using two terminal cells.

``1``
  All others.

Function ``wcswidth()`` simply returns the sum of the ``wcwidth()`` values of
each character along a string, or ``-1`` if any character in the string has a
width of ``-1``.

Full API Documentation at http://wcwidth.readthedocs.org

==========
Developing
==========

Install wcwidth in editable mode::

   pip install -e .

Execute unit tests using tox_::

   tox

Regenerate python code tables from latest Unicode Specification data files::

   tox -eupdate

Supplementary tools for browsing and testing terminals for wide unicode
characters are found in the `bin/`_ of this project's source code.  Just ensure
to first ``pip install -r requirements-develop.txt`` from this project's main
folder. For example, an interactive browser for testing::

   ./bin/wcwidth-browser.py

Uses
----

This library is used in:

- `jquast/blessed`_: a thin, practical wrapper around terminal capabilities in
  Python.
- `jonathanslenders/python-prompt-toolkit`_: a Library for building powerful
  interactive command lines in Python.
- `dbcli/pgcli`_: Postgres CLI with autocompletion and syntax highlighting.
- `thomasballinger/curtsies`_: a Curses-like terminal wrapper with a display
  based on compositing 2d arrays of text.
- `selectel/pyte`_: Simple VTXXX-compatible linux terminal emulator.
- `astanin/python-tabulate`_: Pretty-print tabular data in Python, a library
  and a command-line utility.
- `LuminosoInsight/python-ftfy`_: Fixes mojibake and other glitches in Unicode
  text.
- `nbedos/termtosvg`_: Terminal recorder that renders sessions as SVG
  animations.
- `peterbrittain/asciimatics`_: Package to help people create full-screen text
  UIs.

Other Languages
---------------

- `timoxley/wcwidth`_: JavaScript
- `janlelis/unicode-display_width`_: Ruby
- `alecrabbit/php-wcwidth`_: PHP
- `Text::CharWidth`_: Perl
- `bluebear94/Terminal-WCWidth`_: Perl 6
- `mattn/go-runewidth`_: Go
- `emugel/wcwidth`_: Haxe
- `aperezdc/lua-wcwidth`_: Lua
- `joachimschmidt557/zig-wcwidth`: Zig
- `fumiyas/wcwidth-cjk`_: ``LD_PRELOAD`` override
- `joshuarubin/wcwidth9`: Unicode version 9 in C

History
-------

0.2.0 *2020-06-01*
  * **Enhancement**: Unicode version may be selected by exporting the
    Environment variable ``UNICODE_VERSION``, such as ``13.0``, or ``6.3.0``.
    See the `jquast/ucs-detect`_ CLI utility for automatic detection.
  * **Enhancement**: API Documentation is published to readthedocs.org.
  * **Updated** tables for *all* Unicode Specifications with files
    published in a programmatically consumable format, versions 4.1.0
    through 13.0.

0.1.9 *2020-03-22*
  * **Performance** optimization by `Avram Lubkin`_, `PR #35`_.
  * **Updated** tables to Unicode Specification 13.0.0.

0.1.8 *2020-01-01*
  * **Updated** tables to Unicode Specification 12.0.0. (`PR #30`_).

0.1.7 *2016-07-01*
  * **Updated** tables to Unicode Specification 9.0.0. (`PR #18`_).
0.1.6 *2016-01-08 Production/Stable* * ``LICENSE`` file now included with distribution. 0.1.5 *2015-09-13 Alpha* * **Bugfix**: Resolution of "combining_ character width" issue, most especially those that previously returned -1 now often (correctly) return 0. resolved by `Philip Craig`_ via `PR #11`_. * **Deprecated**: The module path ``wcwidth.table_comb`` is no longer available, it has been superseded by module path ``wcwidth.table_zero``. 0.1.4 *2014-11-20 Pre-Alpha* * **Feature**: ``wcswidth()`` now determines printable length for (most) combining_ characters. The developer's tool `bin/wcwidth-browser.py`_ is improved to display combining_ characters when provided the ``--combining`` option (`Thomas Ballinger`_ and `Leta Montopoli`_ `PR #5`_). * **Feature**: added static analysis (prospector_) to testing framework. 0.1.3 *2014-10-29 Pre-Alpha* * **Bugfix**: 2nd parameter of wcswidth was not honored. (`Thomas Ballinger`_, `PR #4`_). 0.1.2 *2014-10-28 Pre-Alpha* * **Updated** tables to Unicode Specification 7.0.0. (`Thomas Ballinger`_, `PR #3`_). 0.1.1 *2014-05-14 Pre-Alpha* * Initial release to pypi, Based on Unicode Specification 6.3.0 This code was originally derived directly from C code of the same name, whose latest version is available at http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c:: * Markus Kuhn -- 2007-05-26 (Unicode 5.0) * * Permission to use, copy, modify, and distribute this software * for any purpose and without fee is hereby granted. The author * disclaims all warranties with regard to this software. .. _`tox`: https://testrun.org/tox/latest/install.html .. _`prospector`: https://github.com/landscapeio/prospector .. _`combining`: https://en.wikipedia.org/wiki/Combining_character .. _`bin/`: https://github.com/jquast/wcwidth/tree/master/bin .. _`bin/wcwidth-browser.py`: https://github.com/jquast/wcwidth/tree/master/bin/wcwidth-browser.py .. _`Thomas Ballinger`: https://github.com/thomasballinger .. _`Leta Montopoli`: https://github.com/lmontopo .. 
_`Philip Craig`: https://github.com/philipc .. _`PR #3`: https://github.com/jquast/wcwidth/pull/3 .. _`PR #4`: https://github.com/jquast/wcwidth/pull/4 .. _`PR #5`: https://github.com/jquast/wcwidth/pull/5 .. _`PR #11`: https://github.com/jquast/wcwidth/pull/11 .. _`PR #18`: https://github.com/jquast/wcwidth/pull/18 .. _`PR #30`: https://github.com/jquast/wcwidth/pull/30 .. _`PR #35`: https://github.com/jquast/wcwidth/pull/35 .. _`jquast/blessed`: https://github.com/jquast/blessed .. _`selectel/pyte`: https://github.com/selectel/pyte .. _`thomasballinger/curtsies`: https://github.com/thomasballinger/curtsies .. _`dbcli/pgcli`: https://github.com/dbcli/pgcli .. _`jonathanslenders/python-prompt-toolkit`: https://github.com/jonathanslenders/python-prompt-toolkit .. _`timoxley/wcwidth`: https://github.com/timoxley/wcwidth .. _`wcwidth(3)`: http://man7.org/linux/man-pages/man3/wcwidth.3.html .. _`wcswidth(3)`: http://man7.org/linux/man-pages/man3/wcswidth.3.html .. _`astanin/python-tabulate`: https://github.com/astanin/python-tabulate .. _`janlelis/unicode-display_width`: https://github.com/janlelis/unicode-display_width .. _`LuminosoInsight/python-ftfy`: https://github.com/LuminosoInsight/python-ftfy .. _`alecrabbit/php-wcwidth`: https://github.com/alecrabbit/php-wcwidth .. _`Text::CharWidth`: https://metacpan.org/pod/Text::CharWidth .. _`bluebear94/Terminal-WCWidth`: https://github.com/bluebear94/Terminal-WCWidth .. _`mattn/go-runewidth`: https://github.com/mattn/go-runewidth .. _`emugel/wcwidth`: https://github.com/emugel/wcwidth .. _`jquast/ucs-detect`: https://github.com/jquast/ucs-detect .. _`Avram Lubkin`: https://github.com/avylove .. _`nbedos/termtosvg`: https://github.com/nbedos/termtosvg .. _`peterbrittain/asciimatics`: https://github.com/peterbrittain/asciimatics .. _`aperezdc/lua-wcwidth`: https://github.com/aperezdc/lua-wcwidth .. _`fumiyas/wcwidth-cjk`: https://github.com/fumiyas/wcwidth-cjk .. 
|pypi_downloads| image:: https://img.shields.io/pypi/dm/wcwidth.svg?logo=pypi :alt: Downloads :target: https://pypi.org/project/wcwidth/ .. |codecov| image:: https://codecov.io/gh/jquast/wcwidth/branch/master/graph/badge.svg :alt: codecov.io Code Coverage :target: https://codecov.io/gh/jquast/wcwidth/ .. |license| image:: https://img.shields.io/github/license/jquast/wcwidth.svg :target: https://pypi.python.org/pypi/wcwidth/ :alt: MIT License wcwidth-0.2.5/docs/requirements.txt0000644000175000017500000000010013674424426016110 0ustar zigozigoSphinx sphinx-paramlinks sphinx_rtd_theme sphinxcontrib-manpage wcwidth-0.2.5/docs/conf.py0000644000175000017500000001245213674424426014140 0ustar zigozigo#!/usr/bin/env python3 # -*- coding: utf-8 -*- # # wcwidth documentation build configuration file, created by # sphinx-quickstart on Fri Oct 20 15:18:02 2017. # # This file is execfile()d with the current directory set to its # containing dir. # # Note that not all possible configuration values are present in this # autogenerated file. # # All configuration values have a default; values that are commented out # serve to show the default. # If extensions (or modules to document with autodoc) are in another directory, # add these directories to sys.path here. If the directory is relative to the # documentation root, use os.path.abspath to make it absolute, like shown here. # # import os # import sys # sys.path.insert(0, os.path.abspath('.')) # local # 3rd-party imports import wcwidth # -- General configuration ------------------------------------------------ # If your documentation needs a minimal Sphinx version, state it here. # # needs_sphinx = '1.0' # Add any Sphinx extension module names here, as strings. They can be # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom # ones. 
extensions = ['sphinx.ext.autodoc', 'sphinx.ext.doctest', 'sphinx.ext.intersphinx', 'sphinx.ext.coverage', 'sphinx.ext.viewcode'] # Add any paths that contain templates here, relative to this directory. templates_path = ['_templates'] # The suffix(es) of source filenames. # You can specify multiple suffix as a list of string: # # source_suffix = ['.rst', '.md'] source_suffix = '.rst' # The master toctree document. master_doc = 'index' # General information about the project. project = 'wcwidth' copyright = '2017, Jeff Quast' author = 'Jeff Quast' # The version info for the project you're documenting, acts as replacement for # |version| and |release|, also used in various other places throughout the # built documents. # # The short X.Y version, # The full version, including alpha/beta/rc tags. release = version = wcwidth.__version__ # The language for content autogenerated by Sphinx. Refer to documentation # for a list of supported languages. # # This is also used if you do content translation via gettext catalogs. # Usually you set "language" from the command line for these cases. language = None # List of patterns, relative to source directory, that match files and # directories to ignore when looking for source files. # This patterns also effect to html_static_path and html_extra_path exclude_patterns = [] # The name of the Pygments (syntax highlighting) style to use. pygments_style = 'sphinx' # If true, `todo` and `todoList` produce output, else they produce nothing. todo_include_todos = False # -- Options for HTML output ---------------------------------------------- # The theme to use for HTML and HTML Help pages. See the documentation for # a list of builtin themes. # html_theme = 'alabaster' # Theme options are theme-specific and customize the look and feel of a theme # further. For a list of options available for each theme, see the # documentation. 
# # html_theme_options = {} # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". html_static_path = ['_static'] # Custom sidebar templates, must be a dictionary that maps document names # to template names. # # This is required for the alabaster theme # refs: http://alabaster.readthedocs.io/en/latest/installation.html#sidebars # html_sidebars = { # '**': [ # 'about.html', # 'navigation.html', # 'relations.html', # needs 'show_related': True theme option to display # 'searchbox.html', # 'donate.html', # ] # } # -- Options for HTMLHelp output ------------------------------------------ # Output file base name for HTML help builder. htmlhelp_basename = 'wcwidthdoc' # -- Options for LaTeX output --------------------------------------------- latex_elements = { # The paper size ('letterpaper' or 'a4paper'). # # 'papersize': 'letterpaper', # The font size ('10pt', '11pt' or '12pt'). # # 'pointsize': '10pt', # Additional stuff for the LaTeX preamble. # # 'preamble': '', # Latex figure (float) alignment # # 'figure_align': 'htbp', } # Grouping the document tree into LaTeX files. List of tuples # (source start file, target name, title, # author, documentclass [howto, manual, or own class]). latex_documents = [ (master_doc, 'wcwidth.tex', 'wcwidth Documentation', 'Jeff Quast', 'manual'), ] # -- Options for manual page output --------------------------------------- # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). man_pages = [ (master_doc, 'wcwidth', 'wcwidth Documentation', [author], 1) ] # -- Options for Texinfo output ------------------------------------------- # Grouping the document tree into Texinfo files. 
# List of tuples
# (source start file, target name, title, author,
#  dir menu entry, description, category)
texinfo_documents = [
    (master_doc, 'wcwidth', 'wcwidth Documentation',
     author, 'wcwidth', 'One line description of project.',
     'Miscellaneous'),
]

intersphinx_mapping = {'python': ('https://docs.python.org/3', None)}

wcwidth-0.2.5/docs/api.rst

==========
Public API
==========

This package follows SEMVER_ rules for versioning. Therefore, for all of the
function signatures given here, taking version 1.1.1 as an example, you may
use the version dependency ``>=1.1.1,<2.0`` for forward compatibility with
future wcwidth versions.

.. autofunction:: wcwidth.wcwidth

.. autofunction:: wcwidth.wcswidth

.. autofunction:: wcwidth.list_versions

.. _SEMVER: https://semver.org

===========
Private API
===========

These functions should only be used for wcwidth development, and not used by
dependent packages except with care and by use of frozen version dependency,
as these functions may change names, signatures, or disappear entirely at any
time in the future, and such changes are not reflected by SEMVER rules.

If a stable public API for any of the given functions is needed, please
suggest a Pull Request!

.. autofunction:: wcwidth._bisearch

.. autofunction:: wcwidth._wcversion_value

.. autofunction:: wcwidth._wcmatch_version

.. autofunction:: wcwidth._get_package_version

wcwidth-0.2.5/docs/unicode_version.rst

=====================
Unicode release files
=====================

This library aims to be forward-looking, portable, and most correct.
The most current release of this API is based on the Unicode Standard release files: ``DerivedGeneralCategory-4.1.0.txt`` *Date: 2005-02-26, 02:35:50 GMT [MD]* ``DerivedGeneralCategory-5.0.0.txt`` *Date: 2006-02-27, 23:41:27 GMT [MD]* ``DerivedGeneralCategory-5.1.0.txt`` *Date: 2008-03-20, 17:54:57 GMT [MD]* ``DerivedGeneralCategory-5.2.0.txt`` *Date: 2009-08-22, 04:58:21 GMT [MD]* ``DerivedGeneralCategory-6.0.0.txt`` *Date: 2010-08-19, 00:48:09 GMT [MD]* ``DerivedGeneralCategory-6.1.0.txt`` *Date: 2011-11-27, 05:10:22 GMT [MD]* ``DerivedGeneralCategory-6.2.0.txt`` *Date: 2012-05-20, 00:42:34 GMT [MD]* ``DerivedGeneralCategory-6.3.0.txt`` *Date: 2013-07-05, 14:08:45 GMT [MD]* ``DerivedGeneralCategory-7.0.0.txt`` *Date: 2014-02-07, 18:42:12 GMT [MD]* ``DerivedGeneralCategory-8.0.0.txt`` *Date: 2015-02-13, 13:47:11 GMT [MD]* ``DerivedGeneralCategory-9.0.0.txt`` *Date: 2016-06-01, 10:34:26 GMT* ``DerivedGeneralCategory-10.0.0.txt`` *Date: 2017-03-08, 08:41:49 GMT* ``DerivedGeneralCategory-11.0.0.txt`` *Date: 2018-02-21, 05:34:04 GMT* ``DerivedGeneralCategory-12.0.0.txt`` *Date: 2019-01-22, 08:18:28 GMT* ``DerivedGeneralCategory-12.1.0.txt`` *Date: 2019-03-10, 10:53:08 GMT* ``DerivedGeneralCategory-13.0.0.txt`` *Date: 2019-10-21, 14:30:32 GMT* ``EastAsianWidth-4.1.0.txt`` *Date: 2005-03-17, 15:21:00 PST [KW]* ``EastAsianWidth-5.0.0.txt`` *Date: 2006-02-15, 14:39:00 PST [KW]* ``EastAsianWidth-5.1.0.txt`` *Date: 2008-03-20, 17:42:00 PDT [KW]* ``EastAsianWidth-5.2.0.txt`` *Date: 2009-06-09, 17:47:00 PDT [KW]* ``EastAsianWidth-6.0.0.txt`` *Date: 2010-08-17, 12:17:00 PDT [KW]* ``EastAsianWidth-6.1.0.txt`` *Date: 2011-09-19, 18:46:00 GMT [KW]* ``EastAsianWidth-6.2.0.txt`` *Date: 2012-05-15, 18:30:00 GMT [KW]* ``EastAsianWidth-6.3.0.txt`` *Date: 2013-02-05, 20:09:00 GMT [KW, LI]* ``EastAsianWidth-7.0.0.txt`` *Date: 2014-02-28, 23:15:00 GMT [KW, LI]* ``EastAsianWidth-8.0.0.txt`` *Date: 2015-02-10, 21:00:00 GMT [KW, LI]* ``EastAsianWidth-9.0.0.txt`` *Date: 2016-05-27, 17:00:00 
GMT [KW, LI]* ``EastAsianWidth-10.0.0.txt`` *Date: 2017-03-08, 02:00:00 GMT [KW, LI]* ``EastAsianWidth-11.0.0.txt`` *Date: 2018-05-14, 09:41:59 GMT [KW, LI]* ``EastAsianWidth-12.0.0.txt`` *Date: 2019-01-21, 14:12:58 GMT [KW, LI]* ``EastAsianWidth-12.1.0.txt`` *Date: 2019-03-31, 22:01:58 GMT [KW, LI]* ``EastAsianWidth-13.0.0.txt`` *Date: 2029-01-21, 18:14:00 GMT [KW, LI]* wcwidth-0.2.5/docs/index.rst0000644000175000017500000000023613674424426014477 0ustar zigozigowcwidth ======= .. toctree:: intro unicode_version api Indices and tables ------------------ * :ref:`genindex` * :ref:`modindex` * :ref:`search` wcwidth-0.2.5/setup.py0000755000175000017500000000546513674424426013434 0ustar zigozigo#!/usr/bin/env python """ Setup.py distribution file for wcwidth. https://github.com/jquast/wcwidth """ # std imports import os import codecs # 3rd party import setuptools def _get_here(fname): return os.path.join(os.path.dirname(__file__), fname) class _SetupUpdate(setuptools.Command): # This is a compatibility, some downstream distributions might # still call "setup.py update". # # New entry point is tox, 'tox -eupdate'. description = "Fetch and update unicode code tables" user_options = [] def initialize_options(self): pass def finalize_options(self): pass def run(self): import sys import subprocess retcode = subprocess.Popen([ sys.executable, _get_here(os.path.join('bin', 'update-tables.py'))]).wait() assert retcode == 0, ('non-zero exit code', retcode) def main(): """Setup.py entry point.""" setuptools.setup( name='wcwidth', # NOTE: manually manage __version__ in wcwidth/__init__.py ! 
        version='0.2.5',
        description=(
            "Measures the displayed width of unicode strings in a terminal"),
        long_description=codecs.open(
            _get_here('README.rst'), 'rb', 'utf8').read(),
        author='Jeff Quast',
        author_email='contact@jeffquast.com',
        install_requires=('backports.functools-lru-cache>=1.2.1;'
                          'python_version < "3.2"'),
        license='MIT',
        packages=['wcwidth'],
        url='https://github.com/jquast/wcwidth',
        package_data={
            'wcwidth': ['*.json'],
            '': ['LICENSE', '*.rst'],
        },
        zip_safe=True,
        classifiers=[
            'Intended Audience :: Developers',
            'Natural Language :: English',
            'Development Status :: 5 - Production/Stable',
            'Environment :: Console',
            'License :: OSI Approved :: MIT License',
            'Operating System :: POSIX',
            'Programming Language :: Python :: 2.7',
            'Programming Language :: Python :: 3.5',
            'Programming Language :: Python :: 3.6',
            'Programming Language :: Python :: 3.7',
            'Programming Language :: Python :: 3.8',
            'Topic :: Software Development :: Libraries',
            'Topic :: Software Development :: Localization',
            'Topic :: Software Development :: Internationalization',
            'Topic :: Terminals'
        ],
        keywords=[
            'cjk',
            'combining',
            'console',
            'eastasian',
            'emoji',  # comma was missing, which fused this with 'emulator'
            'emulator',
            'terminal',
            'unicode',
            'wcswidth',
            'wcwidth',
            'xterm',
        ],
        cmdclass={'update': _SetupUpdate},
    )


if __name__ == '__main__':
    main()

wcwidth-0.2.5/wcwidth/

wcwidth-0.2.5/wcwidth/wcwidth.py

"""
This is a python implementation of wcwidth() and wcswidth().

https://github.com/jquast/wcwidth

from Markus Kuhn's C code, retrieved from:

    http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c

This is an implementation of wcwidth() and wcswidth() (defined in
IEEE Std 1003.1-2001) for Unicode.
http://www.opengroup.org/onlinepubs/007904975/functions/wcwidth.html http://www.opengroup.org/onlinepubs/007904975/functions/wcswidth.html In fixed-width output devices, Latin characters all occupy a single "cell" position of equal width, whereas ideographic CJK characters occupy two such cells. Interoperability between terminal-line applications and (teletype-style) character terminals using the UTF-8 encoding requires agreement on which character should advance the cursor by how many cell positions. No established formal standards exist at present on which Unicode character shall occupy how many cell positions on character terminals. These routines are a first attempt of defining such behavior based on simple rules applied to data provided by the Unicode Consortium. For some graphical characters, the Unicode standard explicitly defines a character-cell width via the definition of the East Asian FullWidth (F), Wide (W), Half-width (H), and Narrow (Na) classes. In all these cases, there is no ambiguity about which width a terminal shall use. For characters in the East Asian Ambiguous (A) class, the width choice depends purely on a preference of backward compatibility with either historic CJK or Western practice. Choosing single-width for these characters is easy to justify as the appropriate long-term solution, as the CJK practice of displaying these characters as double-width comes from historic implementation simplicity (8-bit encoded characters were displayed single-width and 16-bit ones double-width, even for Greek, Cyrillic, etc.) and not any typographic considerations. Much less clear is the choice of width for the Not East Asian (Neutral) class. Existing practice does not dictate a width for any of these characters. It would nevertheless make sense typographically to allocate two character cells to characters such as for instance EM SPACE or VOLUME INTEGRAL, which cannot be represented adequately with a single-width glyph. 
The following routines at present merely assign a single-cell width to all
neutral characters, in the interest of simplicity. This is not entirely
satisfactory and should be reconsidered before establishing a formal standard
in this area. At the moment, the question of which Not East Asian (Neutral)
characters should be represented by double-width glyphs cannot yet be answered
by applying a simple rule from the Unicode database content. Setting up a
proper standard for the behavior of UTF-8 character terminals will require a
careful analysis not only of each Unicode character, but also of each
presentation form, something the author of these routines has so far avoided
doing.

    http://www.unicode.org/unicode/reports/tr11/

Latest version: http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
"""
from __future__ import division

# std imports
import os
import sys
import warnings

# local
from .table_wide import WIDE_EASTASIAN
from .table_zero import ZERO_WIDTH
from .unicode_versions import list_versions

try:
    from functools import lru_cache
except ImportError:
    # lru_cache was added in Python 3.2
    from backports.functools_lru_cache import lru_cache

# global cache
_UNICODE_CMPTABLE = None
_PY3 = (sys.version_info[0] >= 3)


# NOTE: created by hand, there isn't anything identifiable other than
# general Cf category code to identify these, and some characters in Cf
# category code are of non-zero width.
# Also includes some Cc, Mn, Zl, and Zp characters
ZERO_WIDTH_CF = set([
    0,       # Null (Cc)
    0x034F,  # Combining grapheme joiner (Mn)
    0x200B,  # Zero width space
    0x200C,  # Zero width non-joiner
    0x200D,  # Zero width joiner
    0x200E,  # Left-to-right mark
    0x200F,  # Right-to-left mark
    0x2028,  # Line separator (Zl)
    0x2029,  # Paragraph separator (Zp)
    0x202A,  # Left-to-right embedding
    0x202B,  # Right-to-left embedding
    0x202C,  # Pop directional formatting
    0x202D,  # Left-to-right override
    0x202E,  # Right-to-left override
    0x2060,  # Word joiner
    0x2061,  # Function application
    0x2062,  # Invisible times
    0x2063,  # Invisible separator
])


def _bisearch(ucs, table):
    """
    Auxiliary function for binary search in interval table.

    :arg int ucs: Ordinal value of unicode character.
    :arg list table: List of starting and ending ranges of ordinal values,
        in form of ``[(start, end), ...]``.
    :rtype: int
    :returns: 1 if ordinal value ucs is found within lookup table, else 0.
    """
    lbound = 0
    ubound = len(table) - 1

    if ucs < table[0][0] or ucs > table[ubound][1]:
        return 0
    while ubound >= lbound:
        mid = (lbound + ubound) // 2
        if ucs > table[mid][1]:
            lbound = mid + 1
        elif ucs < table[mid][0]:
            ubound = mid - 1
        else:
            return 1

    return 0


@lru_cache(maxsize=1000)
def wcwidth(wc, unicode_version='auto'):
    r"""
    Given one Unicode character, return its printable length on a terminal.

    :param str wc: A single Unicode character.
    :param str unicode_version: A Unicode version number, such as
        ``'6.0.0'``. The list of available version levels may be
        listed by the companion function :func:`list_versions`.

        Any version string may be specified without error -- the nearest
        matching version is selected.  When ``auto`` (default), the value of
        environment variable ``UNICODE_VERSION`` is used if defined,
        otherwise the highest Unicode version level is used.
    :return: The width, in cells, necessary to display the Unicode
        character ``wc``.
        Returns 0 if the ``wc`` argument has no printable effect on a
        terminal (such as NUL '\0'), -1 if ``wc`` is not printable, or has
        an indeterminate effect on the terminal, such as a control character.
        Otherwise, the number of column positions the character occupies on a
        graphic terminal (1 or 2) is returned.
    :rtype: int

    The following have a column width of -1:

    - C0 control characters (``U+0001`` through ``U+001F``).

    - DEL and C1 control characters (``U+007F`` through ``U+009F``).

    The following have a column width of 0:

    - Non-spacing and enclosing combining characters (general
      category code Mn or Me in the Unicode database).

    - NULL (``U+0000``).

    - COMBINING GRAPHEME JOINER (``U+034F``).

    - ZERO WIDTH SPACE (``U+200B``) *through*
      RIGHT-TO-LEFT MARK (``U+200F``).

    - LINE SEPARATOR (``U+2028``) *and*
      PARAGRAPH SEPARATOR (``U+2029``).

    - LEFT-TO-RIGHT EMBEDDING (``U+202A``) *through*
      RIGHT-TO-LEFT OVERRIDE (``U+202E``).

    - WORD JOINER (``U+2060``) *through*
      INVISIBLE SEPARATOR (``U+2063``).

    The following have a column width of 1:

    - SOFT HYPHEN (``U+00AD``).

    - All remaining characters, including all printable ISO 8859-1
      and WGL4 characters, Unicode control characters, etc.

    The following have a column width of 2:

    - Spacing characters in the East Asian Wide (W) or East Asian
      Full-width (F) category as defined in Unicode Technical
      Report #11.

    - Some kinds of Emoji or symbols.
    """
    # NOTE: created by hand, there isn't anything identifiable other than
    # general Cf category code to identify these, and some characters in Cf
    # category code are of non-zero width.
    ucs = ord(wc)
    if ucs in ZERO_WIDTH_CF:
        return 0

    # C0/C1 control characters
    if ucs < 32 or 0x07F <= ucs < 0x0A0:
        return -1

    _unicode_version = _wcmatch_version(unicode_version)

    # combining characters with zero width
    if _bisearch(ucs, ZERO_WIDTH[_unicode_version]):
        return 0

    return 1 + _bisearch(ucs, WIDE_EASTASIAN[_unicode_version])


def wcswidth(pwcs, n=None, unicode_version='auto'):
    """
    Given a unicode string, return its printable length on a terminal.

    :param str pwcs: Measure width of given unicode string.
    :param int n: When ``n`` is None (default), return the length of the
        entire string, otherwise the width of the first ``n`` characters
        specified.
    :param str unicode_version: An explicit definition of the unicode version
        level to use for determination, may be ``auto`` (default), which uses
        the Environment Variable, ``UNICODE_VERSION`` if defined, or the latest
        available unicode version, otherwise.
    :rtype: int
    :returns: The width, in cells, necessary to display the first ``n``
        characters of the unicode string ``pwcs``.  Returns ``-1`` if
        a non-printable character is encountered.
    """
    # pylint: disable=C0103
    #         Invalid argument name "n"

    end = len(pwcs) if n is None else n
    idx = slice(0, end)
    width = 0
    for char in pwcs[idx]:
        wcw = wcwidth(char, unicode_version)
        if wcw < 0:
            return -1
        width += wcw
    return width


@lru_cache(maxsize=128)
def _wcversion_value(ver_string):
    """
    Integer-mapped value of given dotted version string.

    :param str ver_string: Unicode version string, of form ``n.n.n``.
    :rtype: tuple(int)
    :returns: tuple of integers, ``tuple(int, [...])``.
    """
    retval = tuple(map(int, (ver_string.split('.'))))
    return retval


@lru_cache(maxsize=8)
def _wcmatch_version(given_version):
    """
    Return nearest matching supported Unicode version level.

    If an exact match is not determined, the nearest lower version level is
    returned.  A warning is emitted when the given value is invalid or does
    not exceed the earliest supported version.
    For example, given supported levels ``4.1.0`` and ``5.0.0``, and a version
    string of ``4.9.9``, then ``4.1.0`` is selected and returned:

    >>> _wcmatch_version('4.9.9')
    '4.1.0'
    >>> _wcmatch_version('8.0')
    '8.0.0'
    >>> _wcmatch_version('1')
    '4.1.0'

    :param str given_version: given version for compare, may be ``auto``
        (default), to select Unicode Version from Environment Variable,
        ``UNICODE_VERSION``. If the environment variable is not set, then the
        latest is used.
    :rtype: str
    :returns: unicode string, or non-unicode ``str`` type for python 2
        when given ``version`` is also type ``str``.
    """
    # Design note: the choice to return the same type that is given certainly
    # complicates it for python 2 str-type, but allows us to define an api
    # that uses 'string-type' for unicode version level definitions, so all of
    # our example code works with all versions of python. That, along with the
    # string-to-numeric and comparisons of earliest, latest, matching, or
    # nearest, greatly complicates this function.
    _return_str = not _PY3 and isinstance(given_version, str)

    if _return_str:
        unicode_versions = [ucs.encode() for ucs in list_versions()]
    else:
        unicode_versions = list_versions()
    latest_version = unicode_versions[-1]

    if given_version in (u'auto', 'auto'):
        given_version = os.environ.get(
            'UNICODE_VERSION',
            'latest' if not _return_str else latest_version.encode())

    if given_version in (u'latest', 'latest'):
        # default match, when given as 'latest', use the latest unicode
        # version specification level supported.
        return latest_version if not _return_str else latest_version.encode()

    if given_version in unicode_versions:
        # exact match, downstream has specified an explicit matching version
        # matching any value of list_versions().
        return given_version if not _return_str else given_version.encode()

    # The user's version is not supported by ours. We return the newest unicode
    # version level that we support below their given value.
try: cmp_given = _wcversion_value(given_version) except ValueError: # submitted value raises ValueError in int(), warn and use latest. warnings.warn("UNICODE_VERSION value, {given_version!r}, is invalid. " "Value should be in form of `integer[.]+', the latest " "supported unicode version {latest_version!r} has been " "inferred.".format(given_version=given_version, latest_version=latest_version)) return latest_version if not _return_str else latest_version.encode() # given version is less than any available version, return earliest # version. earliest_version = unicode_versions[0] cmp_earliest_version = _wcversion_value(earliest_version) if cmp_given <= cmp_earliest_version: # this probably isn't what you wanted, the oldest wcwidth.c you will # find in the wild is likely version 5 or 6, both of which we support, # but it's better than not saying anything at all. warnings.warn("UNICODE_VERSION value, {given_version!r}, is lower " "than any available unicode version. Returning lowest " "version level, {earliest_version!r}".format( given_version=given_version, earliest_version=earliest_version)) return earliest_version if not _return_str else earliest_version.encode() # create list of versions which are less than or equal to given version, # and return the tail value, which is the highest level we may support, # or the latest value we support, when completely unmatched or higher # than any supported version. # # this loop never runs to completion, it always returns. for idx, unicode_version in enumerate(unicode_versions): # look ahead to next value try: cmp_next_version = _wcversion_value(unicode_versions[idx + 1]) except IndexError: # at end of list, return latest version return latest_version if not _return_str else latest_version.encode() # Maybe our given version has fewer parts, as in tuple(8, 0), than the # next compare version tuple(8, 0, 0). Test for an exact match by # comparison of only the leading dotted piece(s): (8, 0) == (8, 0).
if cmp_given == cmp_next_version[:len(cmp_given)]: return unicode_versions[idx + 1] # Or, if any next value is greater than our given support level # version, return the current value in index. Even though it must # be less than the given value, it's our closest possible match. That # is, 4.1 is returned for given 4.9.9, where 4.1 and 5.0 are available. if cmp_next_version > cmp_given: return unicode_version assert False, ("Code path unreachable", given_version, unicode_versions) wcwidth-0.2.5/wcwidth/__init__.py0000644000175000017500000000302513674424426015467 0ustar zigozigo""" wcwidth module. https://github.com/jquast/wcwidth """ # re-export all functions & definitions, even private ones, from top-level # module path, to allow for 'from wcwidth import _private_func'. Of course, # user beware that any _private function may disappear or change signature at # any future version. # local from .wcwidth import ZERO_WIDTH # noqa from .wcwidth import (WIDE_EASTASIAN, wcwidth, wcswidth, _bisearch, list_versions, _wcmatch_version, _wcversion_value) # The __all__ attribute defines the items exported by the statement, # 'from wcwidth import *', but also to say, "This is the public API". __all__ = ('wcwidth', 'wcswidth', 'list_versions') # I used to use a _get_package_version() function to use the `pkg_resources' # module to parse the package version from our version.json file, but this blew # some folks up, or more particularly, just the `xonsh' shell. # # Yikes! I always wanted to like xonsh and tried it many times but issues like # these always bit me, too, so I can sympathize -- this version is now manually # kept in sync with version.json to help them out. Shucks, this variable is just # for legacy, from the days before 'pip freeze' was a thing. # # We also used pkg_resources to load unicode version tables from version.json, # generated by bin/update-tables.py, but some environments are unable to # import pkg_resources for one reason or another, yikes!
__version__ = '0.2.5' wcwidth-0.2.5/wcwidth/unicode_versions.py0000644000175000017500000000143013674424426017304 0ustar zigozigo""" Exports function list_versions() for unicode version level support. This code generated by bin/update-tables.py on 2020-06-23 15:58:44.035540. """ def list_versions(): """ Return Unicode version levels supported by this module release. Any of the version strings returned may be used as keyword argument ``unicode_version`` to the ``wcwidth()`` family of functions. :returns: Supported Unicode version numbers in ascending sorted order. :rtype: list[str] """ return ( "4.1.0", "5.0.0", "5.1.0", "5.2.0", "6.0.0", "6.1.0", "6.2.0", "6.3.0", "7.0.0", "8.0.0", "9.0.0", "10.0.0", "11.0.0", "12.0.0", "12.1.0", "13.0.0", ) wcwidth-0.2.5/requirements-develop.txt0000644000175000017500000000004113674424426016620 0ustar zigozigoblessed>=1.14.1,<2 docopt==0.6.2 wcwidth-0.2.5/.pylintrc0000644000175000017500000000205513674424426013554 0ustar zigozigo[MASTER] load-plugins= pylint.extensions.mccabe, pylint.extensions.check_elif, pylint.extensions.docparams, pylint.extensions.overlapping_exceptions, pylint.extensions.redefined_variable_type persistent = no jobs = 0 unsafe-load-any-extension = yes good-names = wc,fp [MESSAGES CONTROL] disable= I, fixme, c-extension-no-member, ungrouped-imports, useless-object-inheritance, missing-yield-type-doc, missing-yield-doc, too-many-lines, inconsistent-return-statements, too-many-return-statements, too-many-boolean-expressions [FORMAT] max-line-length: 100 [PARAMETER_DOCUMENTATION] default-docstring-type=sphinx accept-no-raise-doc=no accept-no-param-doc=yes accept-no-return-doc=yes [DESIGN] max-args=10 max-attributes=7 max-branches=12 max-complexity=13 max-locals=15 max-module-lines=1300 max-parents=7 max-public-methods=20 max-returns=6 max-statements=50 [SIMILARITIES] ignore-imports=yes min-similarity-lines=8 [REPORTS] reports=no msg-template={path}:{line}: [{msg_id}({symbol}), {obj}] {msg} 
wcwidth-0.2.5/bin/0000755000175000017500000000000013674424426012455 5ustar zigozigowcwidth-0.2.5/bin/run_codecov.py0000644000175000017500000000165213674424426015341 0ustar zigozigo"""Workaround for https://github.com/codecov/codecov-python/issues/158.""" # std imports import sys import time # 3rd party import codecov RETRIES = 5 TIMEOUT = 2 def main(): """Run codecov up to RETRIES times On the final attempt, let it exit normally.""" # Make a copy of argv and make sure --required is in it args = sys.argv[1:] if '--required' not in args: args.append('--required') for num in range(1, RETRIES + 1): print('Running codecov attempt %d: ' % num) # On the last, let codecov handle the exit if num == RETRIES: codecov.main() try: codecov.main(*args) except SystemExit as err: # If there's no exit code, it was successful if err.code: time.sleep(TIMEOUT) else: sys.exit(err.code) else: break if __name__ == '__main__': main() wcwidth-0.2.5/bin/update-tables.py0000644000175000017500000002544413674424426015572 0ustar zigozigo#!/usr/bin/env python """ Update the python Unicode tables for wcwidth. 
https://github.com/jquast/wcwidth """ from __future__ import print_function # std imports import os import re import glob import json import codecs import string import urllib import datetime import collections import unicodedata try: # py2 from urllib2 import urlopen except ImportError: # py3 from urllib.request import urlopen URL_UNICODE_DERIVED_AGE = 'http://www.unicode.org/Public/UCD/latest/ucd/DerivedAge.txt' EXCLUDE_VERSIONS = ['2.0.0', '2.1.2', '3.0.0', '3.1.0', '3.2.0', '4.0.0'] PATH_UP = os.path.relpath( os.path.join( os.path.dirname(__file__), os.path.pardir)) PATH_DOCS = os.path.join(PATH_UP, 'docs') PATH_DATA = os.path.join(PATH_UP, 'data') PATH_CODE = os.path.join(PATH_UP, 'wcwidth') FILE_RST = os.path.join(PATH_DOCS, 'unicode_version.rst') FILE_PATCH_FROM = "release files:" FILE_PATCH_TO = "=======" # use chr() for py3.x, # unichr() for py2.x try: _ = unichr(0) except NameError as err: if err.args[0] == "name 'unichr' is not defined": # pylint: disable=C0103,W0622 # Invalid constant name "unichr" (col 8) # Redefining built-in 'unichr' (col 8) unichr = chr else: raise TableDef = collections.namedtuple('table', ['version', 'date', 'values']) def main(): """Update east-asian, combining and zero width tables.""" versions = get_unicode_versions() do_east_asian(versions) do_zero_width(versions) do_rst_file_update() do_unicode_versions(versions) def get_unicode_versions(): """Fetch, determine, and return Unicode Versions for processing.""" fname = os.path.join(PATH_DATA, 'DerivedAge.txt') do_retrieve(url=URL_UNICODE_DERIVED_AGE, fname=fname) pattern = re.compile(r'#.*assigned in Unicode ([0-9.]+)') versions = [] for line in open(fname, 'r'): if match := re.match(pattern, line): version = match.group(1) if version not in EXCLUDE_VERSIONS: versions.append(version) versions.sort(key=lambda ver: list(map(int, ver.split('.')))) return versions def do_rst_file_update(): """Patch unicode_versions.rst to reflect the data files used in release.""" # read in, data_in 
= codecs.open(FILE_RST, 'r', 'utf8').read() # search for beginning and end positions, pos_begin = data_in.find(FILE_PATCH_FROM) assert pos_begin != -1, (pos_begin, FILE_PATCH_FROM) pos_begin += len(FILE_PATCH_FROM) data_out = data_in[:pos_begin] + '\n\n' # find all filenames with a version number in it, # sort filenames by name, then dotted number, ascending glob_pattern = os.path.join(PATH_DATA, '*[0-9]*.txt') filenames = glob.glob(glob_pattern) filenames.sort(key=lambda ver: [ver.split( '-')[0]] + list(map(int, ver.split('-')[-1][:-4].split('.')))) # copy file description as-is, formatted for fpath in filenames: if description := describe_file_header(fpath): data_out += f'\n{description}' # write. print(f"patching {FILE_RST} ..") codecs.open( FILE_RST, 'w', 'utf8').write(data_out) def do_east_asian(versions): """Fetch and update east-asian tables.""" table = {} for version in versions: fin = os.path.join(PATH_DATA, 'EastAsianWidth-{version}.txt') fout = os.path.join(PATH_CODE, 'table_wide.py') url = ('http://www.unicode.org/Public/{version}/' 'ucd/EastAsianWidth.txt') try: do_retrieve(url=url.format(version=version), fname=fin.format(version=version)) except urllib.error.HTTPError as err: if err.code != 404: raise else: table[version] = parse_east_asian( fname=fin.format(version=version), properties=(u'W', u'F',)) do_write_table(fname=fout, variable='WIDE_EASTASIAN', table=table) def do_zero_width(versions): """Fetch and update zero width tables.""" table = {} fout = os.path.join(PATH_CODE, 'table_zero.py') for version in versions: fin = os.path.join(PATH_DATA, 'DerivedGeneralCategory-{version}.txt') url = ('http://www.unicode.org/Public/{version}/ucd/extracted/' 'DerivedGeneralCategory.txt') try: do_retrieve(url=url.format(version=version), fname=fin.format(version=version)) except urllib.error.HTTPError as err: if err.code != 404: raise else: table[version] = parse_category( fname=fin.format(version=version), categories=('Me', 'Mn',)) do_write_table(fname=fout, 
variable='ZERO_WIDTH', table=table) def make_table(values): """Return a tuple of lookup tables for given values.""" table = collections.deque() start, end = values[0], values[0] for num, value in enumerate(values): if num == 0: table.append((value, value,)) continue start, end = table.pop() if end == value - 1: table.append((start, value,)) else: table.append((start, end,)) table.append((value, value,)) return tuple(table) def do_retrieve(url, fname): """Retrieve given url to target filepath fname.""" folder = os.path.dirname(fname) if not os.path.exists(folder): os.makedirs(folder) print(f"{folder}{os.path.sep} created.") if not os.path.exists(fname): try: with open(fname, 'wb') as fout: print(f"retrieving {url}: ", end='', flush=True) resp = urlopen(url) fout.write(resp.read()) except BaseException: print('failed') os.unlink(fname) raise print(f"{fname} saved.") return fname def describe_file_header(fpath): header_2 = [line.lstrip('# ').rstrip() for line in codecs.open(fpath, 'r', 'utf8').readlines()[:2]] # fmt: # # ``EastAsianWidth-8.0.0.txt`` # *2015-02-10, 21:00:00 GMT [KW, LI]* fmt = '``{0}``\n *{1}*\n' if len(header_2) == 0: return '' assert len(header_2) == 2, (fpath, header_2) return fmt.format(*header_2) def parse_east_asian(fname, properties=(u'W', u'F',)): """Parse unicode east-asian width tables.""" print(f'parsing {fname}: ', end='', flush=True) version, date, values = None, None, [] for line in open(fname, 'rb'): uline = line.decode('utf-8') if version is None: version = uline.split(None, 1)[1].rstrip() continue if date is None: date = uline.split(':', 1)[1].rstrip() continue if uline.startswith('#') or not uline.lstrip(): continue addrs, details = uline.split(';', 1) if any(details.startswith(property) for property in properties): start, stop = addrs, addrs if '..' 
in addrs: start, stop = addrs.split('..') values.extend(range(int(start, 16), int(stop, 16) + 1)) print('ok') return TableDef(version, date, values) def parse_category(fname, categories): """Parse unicode category tables.""" print(f'parsing {fname}: ', end='', flush=True) version, date, values = None, None, [] for line in open(fname, 'rb'): uline = line.decode('utf-8') if version is None: version = uline.split(None, 1)[1].rstrip() continue if date is None: date = uline.split(':', 1)[1].rstrip() continue if uline.startswith('#') or not uline.lstrip(): continue addrs, details = uline.split(';', 1) addrs, details = addrs.rstrip(), details.lstrip() if any(details.startswith(f'{value} #') for value in categories): start, stop = addrs, addrs if '..' in addrs: start, stop = addrs.split('..') values.extend(range(int(start, 16), int(stop, 16) + 1)) print('ok') return TableDef(version, date, sorted(values)) def do_write_table(fname, variable, table): """Write combining tables to filesystem as python code.""" # pylint: disable=R0914 # Too many local variables (19/15) (col 4) utc_now = datetime.datetime.utcnow() indent = ' ' * 8 with open(fname, 'w') as fout: print(f"writing {fname} ... 
", end='') fout.write( f'"""{variable.title()} table, created by bin/update-tables.py."""\n' f"# Generated: {utc_now.isoformat()}\n" f"{variable} = {{\n") for version_key, version_table in table.items(): if not version_table.values: continue fout.write( f"{indent[:-4]}'{version_key}': (\n" f"{indent}# Source: {version_table.version}\n" f"{indent}# Date: {version_table.date}\n" f"{indent}#") for start, end in make_table(version_table.values): ucs_start, ucs_end = unichr(start), unichr(end) hex_start, hex_end = (f'0x{start:05x}', f'0x{end:05x}') try: name_start = string.capwords(unicodedata.name(ucs_start)) except ValueError: name_start = u'(nil)' try: name_end = string.capwords(unicodedata.name(ucs_end)) except ValueError: name_end = u'(nil)' fout.write(f'\n{indent}') comment_startpart = name_start[:24].rstrip() comment_endpart = name_end[:24].rstrip() fout.write(f'({hex_start}, {hex_end},),') fout.write(f' # {comment_startpart:24s}..{comment_endpart}') fout.write(f'\n{indent[:-4]}),\n') fout.write('}\n') print("complete.") def do_unicode_versions(versions): """Write unicode_versions.py function list_versions().""" fname = os.path.join(PATH_CODE, 'unicode_versions.py') print(f"writing {fname} ... ", end='') utc_now = datetime.datetime.utcnow() version_tuples_str = '\n '.join( f'"{ver}",' for ver in versions) with open(fname, 'w') as fp: fp.write(f"""\"\"\" Exports function list_versions() for unicode version level support. This code generated by {__file__} on {utc_now}. \"\"\" def list_versions(): \"\"\" Return Unicode version levels supported by this module release. Any of the version strings returned may be used as keyword argument ``unicode_version`` to the ``wcwidth()`` family of functions. :returns: Supported Unicode version numbers in ascending sorted order. 
:rtype: list[str] \"\"\" return ( {version_tuples_str} ) """) print('done.') if __name__ == '__main__': main() wcwidth-0.2.5/bin/new-wide-by-version.py0000755000175000017500000000254713674424426016654 0ustar zigozigo#!/usr/bin/env python3 """ Display new wide unicode point values, by version. For example:: "5.0.0": [ 12752, 12753, 12754, ... Means that chr(12752) through chr(12754) are new WIDE values for Unicode version 5.0.0, and were not WIDE values for the previous version (4.1.0). """ # std imports import sys import json # List new WIDE characters at each unicode version. # def main(): from wcwidth import WIDE_EASTASIAN, _bisearch versions = list(WIDE_EASTASIAN.keys()) results = {} for version in versions: prev_idx = versions.index(version) - 1 if prev_idx == -1: continue previous_version = versions[prev_idx] previous_table = WIDE_EASTASIAN[previous_version] for value_pair in WIDE_EASTASIAN[version]: # table ranges are (start, end) pairs, inclusive of the end value for value in range(value_pair[0], value_pair[1] + 1): if not _bisearch(value, previous_table): results[version] = results.get(version, []) + [value] if '--debug' in sys.argv: print(f'version {version} has unicode character ' f'0x{value:05x} ({chr(value)}) but previous ' f'version, {previous_version} does not.', file=sys.stderr) print(json.dumps(results, indent=4)) if __name__ == '__main__': main() wcwidth-0.2.5/bin/wcwidth-libc-comparator.py0000755000175000017500000000745113674424426017566 0ustar zigozigo#!/usr/bin/env python # coding: utf-8 """ Manual tests comparing wcwidth.py to libc's wcwidth(3) and wcswidth(3). https://github.com/jquast/wcwidth This suite of tests compares the libc return values with the pure-python return values. Although wcwidth(3) is POSIX, its actual implementation may differ, so these tests are not guaranteed to be successful on all platforms, especially where wcwidth(3)/wcswidth(3) is out of date. This is especially true for many platforms -- usually conforming only to unicode specification 1.0 or 2.0.
This program accepts one optional command-line argument, the unicode version level for our library to use when comparing to libc. """ # pylint: disable=C0103 # Invalid module name "wcwidth-libc-comparator" # standard imports from __future__ import print_function # std imports import sys import locale import warnings import ctypes.util import unicodedata # local imports import wcwidth def is_named(ucs): """ Whether the unicode point ``ucs`` has a name. :rtype: bool """ try: return bool(unicodedata.name(ucs)) except ValueError: return False def is_not_combining(ucs): return not unicodedata.combining(ucs) def report_ucs_msg(ucs, wcwidth_libc, wcwidth_local): """ Return string report of character width differences. :param ucs: unicode point. :type ucs: unicode :param wcwidth_libc: libc-wcwidth's reported character length. :type wcwidth_libc: int :param wcwidth_local: wcwidth's reported character length. :type wcwidth_local: int :rtype: unicode """ ucp = (ucs.encode('unicode_escape')[2:] .decode('ascii') .upper() .lstrip('0')) url = "http://codepoints.net/U+{}".format(ucp) name = unicodedata.name(ucs) return (u"libc,ours={},{} [--o{}o--] name={} val={} {}" " ".format(wcwidth_libc, wcwidth_local, ucs, name, ord(ucs), url)) # use chr() for py3.x, # unichr() for py2.x try: _ = unichr(0) except NameError as err: if err.args[0] == "name 'unichr' is not defined": # pylint: disable=W0622 # Redefining built-in 'unichr' (col 8) unichr = chr else: raise if sys.maxunicode < 1114111: warnings.warn('narrow Python build, only a small subset of ' 'characters may be tested.') def _is_equal_wcwidth(libc, ucs, unicode_version): w_libc = libc.wcwidth(ucs) w_local = wcwidth.wcwidth(ucs, unicode_version) assert w_libc == w_local, report_ucs_msg(ucs, w_libc, w_local) def main(using_locale=('en_US', 'UTF-8',)): """ Program entry point. Load the entire Unicode table into memory, excluding those that: - are not named (func unicodedata.name raises ValueError), - are combining characters.
Using ``locale``, for each unicode character string compare libc's wcwidth with local wcwidth.wcwidth() function; when they differ, report a detailed AssertionError to stdout. """ all_ucs = (ucs for ucs in [unichr(val) for val in range(sys.maxunicode)] if is_named(ucs) and is_not_combining(ucs)) libc_name = ctypes.util.find_library('c') if not libc_name: raise ImportError("Can't find C library.") libc = ctypes.cdll.LoadLibrary(libc_name) libc.wcwidth.argtypes = [ctypes.c_wchar, ] libc.wcwidth.restype = ctypes.c_int assert getattr(libc, 'wcwidth', None) is not None assert getattr(libc, 'wcswidth', None) is not None locale.setlocale(locale.LC_ALL, using_locale) unicode_version = 'latest' if len(sys.argv) > 1: unicode_version = sys.argv[1] for ucs in all_ucs: try: _is_equal_wcwidth(libc, ucs, unicode_version) except AssertionError as err: print(err) if __name__ == '__main__': main() wcwidth-0.2.5/bin/wcwidth-browser.py0000755000175000017500000006044213674424426016172 0ustar zigozigo#!/usr/bin/env python """ A terminal browser, similar to less(1) for testing printable width of unicode. This displays the full range of unicode points for 1 or 2-character wide ideograms, with pipes ('|') that should always align for any terminal that supports utf-8. Usage: ./bin/wcwidth-browser.py [--wide=] [--alignment=] [--combining] [--help] Options: --wide= Browser 1 or 2 character-wide cells. --alignment= Chose left or right alignment. [default: left] --combining Use combining character generator. 
[default: 2] --help Display usage """ # pylint: disable=C0103,W0622 # Invalid constant name "echo" # Invalid constant name "flushout" (col 4) # Invalid module name "wcwidth-browser" from __future__ import division, print_function # std imports import sys import signal import string import functools import unicodedata # 3rd party import docopt import blessed # local from wcwidth import ZERO_WIDTH, wcwidth, list_versions, _wcmatch_version #: print function alias, does not end with line terminator. echo = functools.partial(print, end='') flushout = functools.partial(print, end='', flush=True) #: printable length of highest unicode character description LIMIT_UCS = 0x3fffd UCS_PRINTLEN = len('{value:0x}'.format(value=LIMIT_UCS)) def readline(term, width): """A rudimentary readline implementation.""" text = '' while True: inp = term.inkey() if inp.code == term.KEY_ENTER: break if inp.code == term.KEY_ESCAPE or inp == chr(3): text = None break if not inp.is_sequence and len(text) < width: text += inp echo(inp) flushout() elif inp.code in (term.KEY_BACKSPACE, term.KEY_DELETE): if text: text = text[:-1] echo('\b \b') flushout() return text class WcWideCharacterGenerator(object): """Generator yields unicode characters of the given ``width``.""" # pylint: disable=R0903 # Too few public methods (0/2) def __init__(self, width=2, unicode_version='auto'): """ Class constructor. :param width: generate characters of given width. :param str unicode_version: Unicode Version for render. 
:type width: int """ self.characters = ( chr(idx) for idx in range(LIMIT_UCS) if wcwidth(chr(idx), unicode_version=unicode_version) == width) def __iter__(self): """Special method called by iter().""" return self def __next__(self): """Special method called by next().""" while True: ucs = next(self.characters) try: name = string.capwords(unicodedata.name(ucs)) except ValueError: continue return (ucs, name) class WcCombinedCharacterGenerator(object): """Generator yields unicode characters with combining.""" # pylint: disable=R0903 # Too few public methods (0/2) def __init__(self, width=1): """ Class constructor. :param int width: generate characters of given width. """ self.characters = [] letters_o = ('o' * width) last_version = list_versions()[-1] # each table is a tuple of (begin, end) pairs, not a mapping for (begin, end) in ZERO_WIDTH[last_version]: for val in [_val for _val in range(begin, end + 1) if _val <= LIMIT_UCS]: self.characters.append( letters_o[:1] + chr(val) + letters_o[wcwidth(chr(val)) + 1:]) self.characters.reverse() def __iter__(self): """Special method called by iter().""" return self def __next__(self): """ Special method called by next(). :return: unicode character and name, as tuple.
:rtype: tuple[unicode, unicode] :raises StopIteration: no more characters """ while True: if not self.characters: raise StopIteration ucs = self.characters.pop() try: name = string.capwords(unicodedata.name(ucs[1])) except ValueError: continue return (ucs, name) # python 2.6 - 3.3 compatibility next = __next__ class Style(object): """Styling decorator class instance for terminal output.""" # pylint: disable=R0903 # Too few public methods (0/2) @staticmethod def attr_major(text): """non-stylized callable for "major" text, for non-ttys.""" return text @staticmethod def attr_minor(text): """non-stylized callable for "minor" text, for non-ttys.""" return text delimiter = '|' continuation = ' $' header_hint = '-' header_fill = '=' name_len = 10 alignment = 'right' def __init__(self, **kwargs): """ Class constructor. Any given keyword arguments are assigned to the class attribute of the same name. """ for key, val in kwargs.items(): setattr(self, key, val) class Screen(object): """Represents terminal style, data dimensions, and drawables.""" intro_msg_fmt = ('Delimiters ({delim}) should align, ' 'unicode version is {version}.') def __init__(self, term, style, wide=2): """Class constructor.""" self.term = term self.style = style self.wide = wide @property def header(self): """Text of joined segments producing full heading.""" return self.head_item * self.num_columns @property def hint_width(self): """Width of a column segment.""" return sum((len(self.style.delimiter), self.wide, len(self.style.delimiter), len(' '), UCS_PRINTLEN + 2, len(' '), self.style.name_len,)) @property def head_item(self): """Text of a single column heading.""" delimiter = self.style.attr_minor(self.style.delimiter) hint = self.style.header_hint * self.wide heading = ('{delimiter}{hint}{delimiter}' .format(delimiter=delimiter, hint=hint)) def alignment(*args): if self.style.alignment == 'right': return self.term.rjust(*args) return self.term.ljust(*args) txt = alignment(heading, self.hint_width, 
self.style.header_fill) return self.style.attr_major(txt) def msg_intro(self, version): """Introductory message displayed above heading.""" return self.term.center(self.intro_msg_fmt.format( delim=self.style.attr_minor(self.style.delimiter), version=self.style.attr_minor(version))).rstrip() @property def row_ends(self): """Bottom of page.""" return self.term.height - 1 @property def num_columns(self): """Number of columns displayed.""" if self.term.is_a_tty: return self.term.width // self.hint_width return 1 @property def num_rows(self): """Number of rows displayed.""" return self.row_ends - self.row_begins - 1 @property def row_begins(self): """Top row displayed for content.""" # pylint: disable=R0201 # Method could be a function (col 4) return 2 @property def page_size(self): """Number of unicode characters displayed per page.""" return self.num_rows * self.num_columns class Pager(object): """A less(1)-like browser for browsing unicode characters.""" # pylint: disable=too-many-instance-attributes #: screen state for next draw method(s). STATE_CLEAN, STATE_DIRTY, STATE_REFRESH = 0, 1, 2 def __init__(self, term, screen, character_factory): """ Class constructor. :param term: blessed Terminal class instance. :type term: blessed.Terminal :param screen: Screen class instance. :type screen: Screen :param character_factory: Character factory generator. :type character_factory: callable returning iterable.
""" self.term = term self.screen = screen self.character_factory = character_factory self.unicode_version = 'auto' self.dirty = self.STATE_REFRESH self.last_page = 0 self._page_data = list() def on_resize(self, *args): """Signal handler callback for SIGWINCH.""" # pylint: disable=W0613 # Unused argument 'args' assert self.term.width >= self.screen.hint_width, ( 'Screen to small {}, must be at least {}'.format( self.term.width, self.screen.hint_width)) self._set_lastpage() self.dirty = self.STATE_REFRESH def _set_lastpage(self): """Calculate value of class attribute ``last_page``.""" self.last_page = (len(self._page_data) - 1) // self.screen.page_size def display_initialize(self): """Display 'please wait' message, and narrow build warning.""" echo(self.term.home + self.term.clear) echo(self.term.move_y(self.term.height // 2)) echo(self.term.center('Initializing page data ...').rstrip()) flushout() def initialize_page_data(self): """Initialize the page data for the given screen.""" # pylint: disable=attribute-defined-outside-init if self.term.is_a_tty: self.display_initialize() self.character_generator = self.character_factory( self.screen.wide) self._page_data = list() while True: try: self._page_data.append(next(self.character_generator)) except StopIteration: break self._set_lastpage() def page_data(self, idx, offset): """ Return character data for page of given index and offset. :param idx: page index. :type idx: int :param offset: scrolling region offset of current page. 
:type offset: int :returns: list of tuples in form of ``(ucs, name)`` :rtype: list[(unicode, unicode)] """ size = self.screen.page_size while offset < 0 and idx: offset += size idx -= 1 offset = max(0, offset) while offset >= size: offset -= size idx += 1 if idx == self.last_page: offset = 0 idx = min(max(0, idx), self.last_page) start = (idx * self.screen.page_size) + offset end = start + self.screen.page_size return (idx, offset), self._page_data[start:end] def _run_notty(self, writer): """Pager run method for terminals that are not a tty.""" page_idx = page_offset = 0 while True: npage_idx, _ = self.draw(writer, page_idx + 1, page_offset) if npage_idx == self.last_page: # page displayed was last page, quit. break page_idx = npage_idx self.dirty = self.STATE_DIRTY def _run_tty(self, writer, reader): """Pager run method for terminals that are a tty.""" # allow window-change signal to reflow screen signal.signal(signal.SIGWINCH, self.on_resize) page_idx = page_offset = 0 while True: if self.dirty: page_idx, page_offset = self.draw(writer, page_idx, page_offset) self.dirty = self.STATE_CLEAN inp = reader(timeout=0.25) if inp is not None: nxt, noff = self.process_keystroke(inp, page_idx, page_offset) if self.dirty: continue if not self.dirty: self.dirty = nxt != page_idx or noff != page_offset page_idx, page_offset = nxt, noff if page_idx == -1: return def run(self, writer, reader): """ Pager entry point. In interactive mode (terminal is a tty), run until ``process_keystroke()`` detects quit keystroke ('q'). In non-interactive mode, exit after displaying all unicode points. :param writer: callable writes to output stream, receiving unicode. :type writer: callable :param reader: callable reads keystrokes from input stream, sending instance of blessed.keyboard.Keystroke. 
        :type reader: callable
        """
        self.initialize_page_data()
        if not self.term.is_a_tty:
            self._run_notty(writer)
        else:
            self._run_tty(writer, reader)

    def process_keystroke(self, inp, idx, offset):
        """
        Process keystroke ``inp``, adjusting screen parameters.

        :param inp: return value of blessed.Terminal.inkey().
        :type inp: blessed.keyboard.Keystroke
        :param idx: page index.
        :type idx: int
        :param offset: scrolling region offset of current page.
        :type offset: int
        :returns: tuple of next (idx, offset).
        :rtype: (int, int)
        """
        if inp.lower() == 'q':
            # exit
            return (-1, -1)
        self._process_keystroke_commands(inp)
        idx, offset = self._process_keystroke_movement(inp, idx, offset)
        return idx, offset

    def _process_keystroke_commands(self, inp):
        """Process keystrokes that issue commands (side effects)."""
        if inp in ('1', '2') and self.screen.wide != int(inp):
            # change between 1 or 2-character wide mode.
            self.screen.wide = int(inp)
            self.initialize_page_data()
            self.on_resize(None, None)
        elif inp == 'c':
            # switch on/off combining characters
            self.character_factory = (
                WcWideCharacterGenerator
                if self.character_factory != WcWideCharacterGenerator
                else WcCombinedCharacterGenerator)
            self.initialize_page_data()
            self.on_resize(None, None)
        elif inp in ('_', '-'):
            # adjust name length -2
            nlen = max(1, self.screen.style.name_len - 2)
            if nlen != self.screen.style.name_len:
                self.screen.style.name_len = nlen
                self.on_resize(None, None)
        elif inp in ('+', '='):
            # adjust name length +2
            nlen = min(self.term.width - 8, self.screen.style.name_len + 2)
            if nlen != self.screen.style.name_len:
                self.screen.style.name_len = nlen
                self.on_resize(None, None)
        elif inp == 'v':
            with self.term.location(x=0, y=self.term.height - 2):
                print(self.term.clear_eos())
                input_selection_msg = (
                    "--> Enter unicode version [{versions}] ("
                    "current: {self.unicode_version}):".format(
                        versions=', '.join(list_versions()),
                        self=self))
                echo('\n'.join(self.term.wrap(input_selection_msg,
                                              subsequent_indent=' ')))
                echo(' ')
                flushout()
                inp = readline(self.term, width=max(map(len, list_versions())))
                if inp.strip() and inp != self.unicode_version:
                    # set new unicode version -- page data must be
                    # re-initialized. Any version is legal, underlying
                    # library performs best-match (with warnings)
                    self.unicode_version = _wcmatch_version(inp)
                    self.initialize_page_data()
                self.on_resize(None, None)

    def _process_keystroke_movement(self, inp, idx, offset):
        """Process keystrokes that adjust index and offset."""
        term = self.term
        # a little vi-inspired.
        if inp in ('y', 'k') or inp.code in (term.KEY_UP,):
            # scroll backward 1 line
            offset -= self.screen.num_columns
        elif inp in ('e', 'j') or inp.code in (term.KEY_ENTER, term.KEY_DOWN,):
            # scroll forward 1 line
            offset = offset + self.screen.num_columns
        elif inp in ('f', ' ') or inp.code in (term.KEY_PGDOWN,):
            # scroll forward 1 page
            idx += 1
        elif inp == 'b' or inp.code in (term.KEY_PGUP,):
            # scroll backward 1 page
            idx = max(0, idx - 1)
        elif inp == 'F' or inp.code in (term.KEY_SDOWN,):
            # scroll forward 10 pages
            idx = max(0, idx + 10)
        elif inp == 'B' or inp.code in (term.KEY_SUP,):
            # scroll backward 10 pages
            idx = max(0, idx - 10)
        elif inp.code == term.KEY_HOME:
            # top
            idx, offset = (0, 0)
        elif inp == 'G' or inp.code == term.KEY_END:
            # bottom
            idx, offset = (self.last_page, 0)
        elif inp == '\x0c':
            # ^L (form feed): mark the screen dirty so it is redrawn
            self.dirty = True
        return idx, offset

    def draw(self, writer, idx, offset):
        """
        Draw the current page view to ``writer``.

        :param callable writer: callable writes to output stream, receiving unicode.
        :param int idx: current page index.
        :param int offset: scrolling region offset of current page.
        :returns: tuple of next (idx, offset).
        :rtype: (int, int)
        """
        # as our screen can be resized while we're mid-calculation,
        # our self.dirty flag can become re-toggled; because we are
        # not re-flowing our pagination, we must begin over again.
        while self.dirty:
            self.draw_heading(writer)
            self.dirty = self.STATE_CLEAN
            (idx, offset), data = self.page_data(idx, offset)
            for txt in self.page_view(data):
                writer(txt)
        self.draw_status(writer, idx)
        flushout()
        return idx, offset

    def draw_heading(self, writer):
        """
        Conditionally redraw screen when ``dirty`` attribute is valued REFRESH.

        When Pager attribute ``dirty`` is ``STATE_REFRESH``, cursor is moved
        to (0,0), screen is cleared, and heading is displayed.

        :param callable writer: callable writes to output stream, receiving unicode.
        :return: True if class attribute ``dirty`` is ``STATE_REFRESH``.
        :rtype: bool
        """
        if self.dirty == self.STATE_REFRESH:
            writer(''.join(
                (self.term.home, self.term.clear,
                 self.screen.msg_intro(version=self.unicode_version), '\n',
                 self.screen.header, '\n',)))
            return True
        return False

    def draw_status(self, writer, idx):
        """
        Conditionally draw status bar when output terminal is a tty.

        :param callable writer: callable writes to output stream, receiving unicode.
        :param int idx: current page position index.
        :type idx: int
        """
        if self.term.is_a_tty:
            writer(self.term.hide_cursor())
            style = self.screen.style
            writer(self.term.move(self.term.height - 1))
            if idx == self.last_page:
                last_end = '(END)'
            else:
                last_end = '/{0}'.format(self.last_page)
            txt = ('Page {idx}{last_end} - '
                   '{q} to quit, [keys: {keyset}]'
                   .format(idx=style.attr_minor('{0}'.format(idx)),
                           last_end=style.attr_major(last_end),
                           keyset=style.attr_major('kjfbvc12-='),
                           q=style.attr_minor('q')))
            writer(self.term.center(txt).rstrip())

    def page_view(self, data):
        """
        Generator yields text to be displayed for the current unicode pageview.

        :param list[(unicode, unicode)] data: The current page's data as tuple
            of ``(ucs, name)``.
        :returns: generator for full-page text for display
        """
        if self.term.is_a_tty:
            yield self.term.move(self.screen.row_begins, 0)
        # sequence clears to end-of-line
        clear_eol = self.term.clear_eol
        # sequence clears to end-of-screen
        clear_eos = self.term.clear_eos

        # track our current column and row, where column is
        # the whole segment of unicode value text, and draw
        # only self.screen.num_columns before end-of-line.
        #
        # use clear_eol at end of each row to erase over any
        # "ghosted" text, and clear_eos at end of screen to
        # clear the same, especially for the final page which
        # is often short.
        col = 0
        for ucs, name in data:
            val = self.text_entry(ucs, name)
            col += 1
            if col == self.screen.num_columns:
                col = 0
                if self.term.is_a_tty:
                    val = ''.join((val, clear_eol, '\n'))
                else:
                    val = ''.join((val.rstrip(), '\n'))
            yield val

        if self.term.is_a_tty:
            yield ''.join((clear_eol, '\n', clear_eos))

    def text_entry(self, ucs, name):
        """
        Display a single column segment row describing ``(ucs, name)``.

        :param str ucs: target unicode point character string.
        :param str name: name of unicode point.
        :return: formatted text for display.
        :rtype: unicode
        """
        style = self.screen.style
        if len(name) > style.name_len:
            idx = max(0, style.name_len - len(style.continuation))
            name = ''.join((name[:idx], style.continuation if idx else ''))
        if style.alignment == 'right':
            fmt = ' '.join(('0x{val:0>{ucs_printlen}x}',
                            '{name:<{name_len}s}',
                            '{delimiter}{ucs}{delimiter}'
                            ))
        else:
            fmt = ' '.join(('{delimiter}{ucs}{delimiter}',
                            '0x{val:0>{ucs_printlen}x}',
                            '{name:<{name_len}s}'))
        delimiter = style.attr_minor(style.delimiter)
        if len(ucs) != 1:
            # determine display of combining characters
            val = ord(ucs[1])
            # a combining character displayed of any fg color
            # will reset the foreground character of the cell
            # combined with (iTerm2, OSX).
            disp_ucs = style.attr_major(ucs[0:2])
            if len(ucs) > 2:
                disp_ucs += ucs[2]
        else:
            # non-combining
            val = ord(ucs)
            disp_ucs = style.attr_major(ucs)

        return fmt.format(name_len=style.name_len,
                          ucs_printlen=UCS_PRINTLEN,
                          delimiter=delimiter,
                          name=name,
                          ucs=disp_ucs,
                          val=val)


def validate_args(opts):
    """Validate and return options provided by docopt parsing."""
    if opts['--wide'] is None:
        opts['--wide'] = 2
    else:
        assert opts['--wide'] in ("1", "2"), opts['--wide']
    if opts['--alignment'] is None:
        opts['--alignment'] = 'left'
    else:
        assert opts['--alignment'] in ('left', 'right'), opts['--alignment']
    opts['--wide'] = int(opts['--wide'])
    opts['character_factory'] = WcWideCharacterGenerator
    if opts['--combining']:
        opts['character_factory'] = WcCombinedCharacterGenerator
    return opts


def main(opts):
    """Program entry point."""
    term = blessed.Terminal()
    style = Style()

    # if the terminal supports colors, use a Style instance with some
    # standout colors (magenta, cyan).
    if term.number_of_colors:
        style = Style(attr_major=term.magenta,
                      attr_minor=term.bright_cyan,
                      alignment=opts['--alignment'])
    style.name_len = 10

    screen = Screen(term, style, wide=opts['--wide'])
    pager = Pager(term, screen, opts['character_factory'])

    with term.location(), term.cbreak(), \
            term.fullscreen(), term.hidden_cursor():
        pager.run(writer=echo, reader=term.inkey)
    return 0


if __name__ == '__main__':
    sys.exit(main(validate_args(docopt.docopt(__doc__))))

wcwidth-0.2.5/tests/

wcwidth-0.2.5/tests/__init__.py

"""This file intentionally left blank."""

wcwidth-0.2.5/tests/test_core.py

# coding: utf-8
"""Core tests for wcwidth module."""
# 3rd party
import pkg_resources

# local
import wcwidth


def test_package_version():
    """wcwidth.__version__ is expected value."""
    # given,
    expected = pkg_resources.get_distribution('wcwidth').version
    # exercise,
    result = wcwidth.__version__

    # verify.
    assert result == expected


def test_hello_jp():
    u"""
    Width of Japanese phrase: コンニチハ, セカイ!

    Given a phrase of 5 and 3 Katakana characters, joined with
    3 English-ASCII punctuation characters, totaling 11, this
    phrase consumes 19 cells of a terminal emulator.
    """
    # given,
    phrase = u'コンニチハ, セカイ!'
    expect_length_each = (2, 2, 2, 2, 2, 1, 1, 2, 2, 2, 1)
    expect_length_phrase = sum(expect_length_each)

    # exercise,
    length_each = tuple(map(wcwidth.wcwidth, phrase))
    length_phrase = wcwidth.wcswidth(phrase)

    # verify.
    assert length_each == expect_length_each
    assert length_phrase == expect_length_phrase


def test_wcswidth_substr():
    """
    Test wcswidth() optional 2nd parameter, ``n``.

    ``n`` determines at which position of the string
    to stop counting length.
    """
    # given,
    phrase = u'コンニチハ, セカイ!'
    end = 7
    expect_length_each = (2, 2, 2, 2, 2, 1, 1,)
    expect_length_phrase = sum(expect_length_each)

    # exercise,
    length_phrase = wcwidth.wcswidth(phrase, end)

    # verify.
    assert length_phrase == expect_length_phrase


def test_null_width_0():
    """NULL (0) reports width 0."""
    # given,
    phrase = u'abc\x00def'
    expect_length_each = (1, 1, 1, 0, 1, 1, 1)
    expect_length_phrase = sum(expect_length_each)

    # exercise,
    length_each = tuple(map(wcwidth.wcwidth, phrase))
    length_phrase = wcwidth.wcswidth(phrase, len(phrase))

    # verify.
    assert length_each == expect_length_each
    assert length_phrase == expect_length_phrase


def test_control_c0_width_negative_1():
    """CSI (Control sequence initiate) reports width -1 for ESC."""
    # given,
    phrase = u'\x1b[0m'
    expect_length_each = (-1, 1, 1, 1)
    expect_length_phrase = -1

    # exercise,
    length_each = tuple(map(wcwidth.wcwidth, phrase))
    length_phrase = wcwidth.wcswidth(phrase, len(phrase))

    # verify.
    assert length_each == expect_length_each
    assert length_phrase == expect_length_phrase


def test_combining_width():
    """Simple test combining reports total width of 4."""
    # given,
    phrase = u'--\u05bf--'
    expect_length_each = (1, 1, 0, 1, 1)
    expect_length_phrase = 4

    # exercise,
    length_each = tuple(map(wcwidth.wcwidth, phrase))
    length_phrase = wcwidth.wcswidth(phrase, len(phrase))

    # verify.
    assert length_each == expect_length_each
    assert length_phrase == expect_length_phrase


def test_combining_cafe():
    u"""Phrase cafe + COMBINING ACUTE ACCENT is café of length 4."""
    phrase = u"cafe\u0301"
    expect_length_each = (1, 1, 1, 1, 0)
    expect_length_phrase = 4

    # exercise,
    length_each = tuple(map(wcwidth.wcwidth, phrase))
    length_phrase = wcwidth.wcswidth(phrase, len(phrase))

    # verify.
    assert length_each == expect_length_each
    assert length_phrase == expect_length_phrase


def test_combining_enclosing():
    u"""CYRILLIC CAPITAL LETTER A + COMBINING CYRILLIC HUNDRED THOUSANDS SIGN is А҈ of length 1."""
    phrase = u"\u0410\u0488"
    expect_length_each = (1, 0)
    expect_length_phrase = 1

    # exercise,
    length_each = tuple(map(wcwidth.wcwidth, phrase))
    length_phrase = wcwidth.wcswidth(phrase, len(phrase))

    # verify.
    assert length_each == expect_length_each
    assert length_phrase == expect_length_phrase


def test_combining_spacing():
    u"""Balinese kapal (ship) is ᬓᬨᬮ᭄ of length 4."""
    phrase = u"\u1B13\u1B28\u1B2E\u1B44"
    expect_length_each = (1, 1, 1, 1)
    expect_length_phrase = 4

    # exercise,
    length_each = tuple(map(wcwidth.wcwidth, phrase))
    length_phrase = wcwidth.wcswidth(phrase, len(phrase))

    # verify.
    assert length_each == expect_length_each
    assert length_phrase == expect_length_phrase

wcwidth-0.2.5/tests/test_ucslevel.py

# coding: utf-8
"""Unicode version level tests for wcwidth."""
# std imports
import json
import warnings

# 3rd party
import pytest
import pkg_resources

# local
import wcwidth


def test_latest():
    """wcwidth._wcmatch_version('latest') returns tail item."""
    # given,
    expected = wcwidth.list_versions()[-1]

    # exercise,
    result = wcwidth._wcmatch_version('latest')

    # verify.
    assert result == expected


def test_exact_410_str():
    """wcwidth._wcmatch_version('4.1.0') returns equal value (str)."""
    # given,
    given = expected = '4.1.0'

    # exercise,
    result = wcwidth._wcmatch_version(given)

    # verify.
    assert result == expected


def test_exact_410_unicode():
    """wcwidth._wcmatch_version(u'4.1.0') returns equal value (unicode)."""
    # given,
    given = expected = u'4.1.0'

    # exercise,
    result = wcwidth._wcmatch_version(given)

    # verify.
    assert result == expected


def test_nearest_505_str():
    """wcwidth._wcmatch_version('5.0.5') returns nearest '5.0.0'. (str)"""
    # given
    given, expected = '5.0.5', '5.0.0'

    # exercise
    result = wcwidth._wcmatch_version(given)

    # verify.
    assert result == expected


def test_nearest_505_unicode():
    """wcwidth._wcmatch_version(u'5.0.5') returns nearest u'5.0.0'. (unicode)"""
    # given
    given, expected = u'5.0.5', u'5.0.0'

    # exercise
    result = wcwidth._wcmatch_version(given)

    # verify.
    assert result == expected


def test_nearest_lowint40_str():
    """wcwidth._wcmatch_version('4.0') returns nearest '4.1.0'."""
    # given
    given, expected = '4.0', '4.1.0'
    warnings.resetwarnings()
    wcwidth._wcmatch_version.cache_clear()

    # exercise
    with pytest.warns(UserWarning):
        # warns that given version is lower than any available
        result = wcwidth._wcmatch_version(given)

    # verify.
    assert result == expected


def test_nearest_lowint40_unicode():
    """wcwidth._wcmatch_version(u'4.0') returns nearest u'4.1.0'."""
    # given
    given, expected = u'4.0', u'4.1.0'
    warnings.resetwarnings()
    wcwidth._wcmatch_version.cache_clear()

    # exercise
    with pytest.warns(UserWarning):
        # warns that given version is lower than any available
        result = wcwidth._wcmatch_version(given)

    # verify.
    assert result == expected


def test_nearest_800_str():
    """wcwidth._wcmatch_version('8') returns nearest '8.0.0'."""
    # given
    given, expected = '8', '8.0.0'

    # exercise
    result = wcwidth._wcmatch_version(given)

    # verify.
    assert result == expected


def test_nearest_800_unicode():
    """wcwidth._wcmatch_version(u'8') returns nearest u'8.0.0'."""
    # given
    given, expected = u'8', u'8.0.0'

    # exercise
    result = wcwidth._wcmatch_version(given)

    # verify.
    assert result == expected


def test_nearest_999_str():
    """wcwidth._wcmatch_version('999.0') returns nearest (latest)."""
    # given
    given, expected = '999.0', wcwidth.list_versions()[-1]

    # exercise
    result = wcwidth._wcmatch_version(given)

    # verify.
    assert result == expected


def test_nearest_999_unicode():
    """wcwidth._wcmatch_version(u'999.0') returns nearest (latest)."""
    # given
    given, expected = u'999.0', wcwidth.list_versions()[-1]

    # exercise
    result = wcwidth._wcmatch_version(given)

    # verify.
    assert result == expected


def test_nonint_unicode():
    """wcwidth._wcmatch_version(u'x.y.z') returns latest (unicode)."""
    # given
    given, expected = u'x.y.z', wcwidth.list_versions()[-1]
    warnings.resetwarnings()
    wcwidth._wcmatch_version.cache_clear()

    # exercise
    with pytest.warns(UserWarning):
        # warns that given version is not valid
        result = wcwidth._wcmatch_version(given)

    # verify.
    assert result == expected


def test_nonint_str():
    """wcwidth._wcmatch_version('x.y.z') returns latest (str)."""
    # given
    given, expected = 'x.y.z', wcwidth.list_versions()[-1]
    warnings.resetwarnings()
    wcwidth._wcmatch_version.cache_clear()

    # exercise
    with pytest.warns(UserWarning):
        # warns that given version is not valid
        result = wcwidth._wcmatch_version(given)

    # verify.
    assert result == expected

wcwidth-0.2.5/setup.cfg

[bdist_wheel]
universal = 1

[metadata]
license_file = LICENSE

wcwidth-0.2.5/tox.ini

[tox]
envlist = update, compile, autopep8, docformatter, isort, pylint, flake8, flake8_tests, pydocstyle, docs, py26, py27, py34, py35, py36
skip_missing_interpreters = true

[testenv]
deps = pytest==4.6.10
       pytest-cov==2.8.1
commands = {envpython} -m pytest --cov-config={toxinidir}/tox.ini {posargs:\
             --strict --verbose \
             --junit-xml=.tox/results.{envname}.xml \
             --durations=3 \
             } \
             --log-format='%(levelname)s %(relativeCreated)2.2f %(filename)s:%(lineno)d %(message)s' \
             tests
passenv = TEST_QUICK TEST_KEYBOARD TEST_RAW

[isort]
line_length = 100
indent = ' '
multi_line_output = 1
length_sort = 1
import_heading_stdlib = std imports
import_heading_thirdparty = 3rd party
import_heading_firstparty = local
import_heading_localfolder = local
sections=FUTURE,STDLIB,THIRDPARTY,FIRSTPARTY,LOCALFOLDER
no_lines_before=LOCALFOLDER
known_first_party = wcwidth
known_third_party = codecov,docopt,blessed
atomic = true

[pytest]
looponfailroots = wcwidth
norecursedirs = .git .tox build
addopts = --disable-pytest-warnings
          --cov-append --cov-report=html --color=yes --ignore=setup.py --ignore=.tox
          --cov=wcwidth
filterwarnings = error
junit_family = xunit1

[flake8]
max-line-length = 100
exclude = .tox,build
deps = flake8==3.8.2

[coverage:run]
branch = True
source = wcwidth
parallel = True

[coverage:report]
omit = tests/*
exclude_lines = pragma: no cover
precision = 1

[coverage:paths]
source = wcwidth/

[testenv:compile]
basepython = python3.8
commands = python -m compileall {toxinidir}/wcwidth

[testenv:update]
usedevelop = true
basepython = python3.8
deps =
commands = python {toxinidir}/bin/update-tables.py
           python -mcompileall {toxinidir}/wcwidth/table_zero.py \
               {toxinidir}/wcwidth/table_wide.py

[testenv:autopep8]
basepython = python3.8
deps = autopep8==1.4.4
commands =
    {envbindir}/autopep8 \
        --in-place \
        --recursive \
        --aggressive \
        --aggressive \
        wcwidth/ bin/ tests/ setup.py

[testenv:docformatter]
deps =
    docformatter==1.3.1
    untokenize==0.1.1
commands =
    {envbindir}/docformatter \
        --in-place \
        --recursive \
        --pre-summary-newline \
        --wrap-summaries=100 \
        --wrap-descriptions=100 \
        {toxinidir}/wcwidth \
        {toxinidir}/bin \
        {toxinidir}/setup.py \
        {toxinidir}/docs/conf.py
basepython = python3.8

[testenv:isort]
deps = {[testenv]deps}
       -r docs/requirements.txt
       isort==4.3.21
commands = {envbindir}/isort --quiet --apply --recursive
basepython = python3.8

[testenv:pylint]
deps = pylint==2.5.2
commands = {envbindir}/pylint --rcfile={toxinidir}/.pylintrc \
           --ignore=tests,docs,setup.py,conf.py,build,distutils,.pyenv,.git,.tox \
           {posargs:{toxinidir}}/wcwidth

[testenv:flake8]
deps = {[flake8]deps}
commands = {envbindir}/flake8 --ignore=F401,W503,W504 --exclude=tests setup.py docs/ wcwidth/ bin/

[testenv:flake8_tests]
deps = {[flake8]deps}
commands = {envbindir}/flake8 --ignore=W503,W504,F811,F401 tests/ bin/

[testenv:pydocstyle]
deps = pydocstyle==5.0.2
       restructuredtext_lint==1.3.0
       doc8==0.8.0
       pygments
commands = {envbindir}/pydocstyle --source --explain {toxinidir}/wcwidth
           {envbindir}/rst-lint README.rst
           {envbindir}/doc8 --ignore-path docs/_build --ignore D000 docs

[testenv:check]
deps = -rrequirements-develop.txt
usedevelop = true
commands = prospector {posargs:--no-autodetect --die-on-tool-error}
basepython = python3.8

[testenv:docs]
deps = sphinx
commands = sphinx-build docs/ build/sphinx

[testenv:sphinx]
deps = -r {toxinidir}/docs/requirements.txt
commands = {envbindir}/sphinx-build {posargs:-v -W -d {toxinidir}/docs/_build/doctrees -b html docs {toxinidir}/docs/_build/html}

[testenv:linkcheck]
deps = -r {toxinidir}/docs/requirements.txt
commands = {envbindir}/sphinx-build -v -W -d {toxinidir}/docs/_build/doctrees -b linkcheck docs docs/_build/linkcheck

[testenv:codecov]
basepython = python{env:TOXPYTHON:{env:TRAVIS_PYTHON_VERSION:3.8}}
passenv = TOXENV CI TRAVIS TRAVIS_* CODECOV_*
deps = codecov>=1.4.0
       tenacity==6.1.0
# commands = codecov -e TOXENV
# Workaround for https://github.com/codecov/codecov-python/issues/158
commands = {envpython} bin/run_codecov.py -e TOXENV

[testenv:develop]
deps = -rrequirements-develop.txt
commands = {posargs}

wcwidth-0.2.5/LICENSE

The MIT License (MIT)

Copyright (c) 2014 Jeff Quast

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

Markus Kuhn -- 2007-05-26 (Unicode 5.0)

Permission to use, copy, modify, and distribute this software
for any purpose and without fee is hereby granted. The author
disclaims all warranties with regard to this software.

wcwidth-0.2.5/MANIFEST.in

include LICENSE *.rst
recursive-include tests *.py

wcwidth-0.2.5/.gitignore

__pycache__
.coverage
.cache
.tox
*.egg-info
*.egg
*.pyc
*.swp
build
dist
docs/_build
htmlcov
.coveralls.yml
data
.DS_Store

wcwidth-0.2.5/.travis.yml

language: python
matrix:
  fast_finish: true
  include:
    - python: 3.8
      env: TOXENV=update,compile,autopep8,docformatter,isort,pylint,flake8,flake8_tests,pydocstyle,docs COVERAGE_ID=travis-ci
    - python: 2.7
      env: TOXENV=py27,codecov COVERAGE_ID=travis-ci
    - python: 3.4
      env: TOXENV=py34,codecov COVERAGE_ID=travis-ci
    - python: 3.5
      env: TOXENV=py35,codecov COVERAGE_ID=travis-ci
    - python: 3.6
      env: TOXENV=py36,codecov COVERAGE_ID=travis-ci
    - python: 3.7
      env: TOXENV=py37,codecov COVERAGE_ID=travis-ci
    - python: 3.8
      env: TOXENV=py38,codecov COVERAGE_ID=travis-ci

install:
  - pip install tox
script:
  - tox

sudo: false

notifications:
  email:
    recipients:
      - contact@jeffquast.com
    on_success: change
    on_failure: change