Sunday, February 9, 2014

Installing PyTables and running into problems with Pandas and NumPy... The whole stack is at risk... What to do?

The universal way to put the stack back together is to reinstall the latest versions. After reading posts about compiling and recompiling, I found a universal remedy: an unofficial library of executable installers for extension packages. Since this is the second time I have stepped on this rake (I had used this library before and forgot about it), I am publishing this post as a lesson to myself.
Here is the magic wand: Unofficial Windows Binaries for Python Extension Packages
And here there is no binary for amd64: Numerical Python
Besides Anaconda, there are about a dozen other Scientific Python distributions
NumPy is a fundamental package needed for scientific computing with Python.
Note: these builds are not completely compatible with the official SciPy binaries.
Note: the MKL builds are linked statically to Intel's high performance Math Kernel Library.
Note: the unoptimized builds are less tested and not recommended for general use.
After the great revelations of the Python philosophy (previous post), I am starting to appreciate the services of the other creeping reptiles. It turned out that my Anaconda already had everything needed for PyTables. I found the PyTables link by googling the query "Python Tables".
By the way, the PrettyTable library on code.google.com also popped up there, but it seemed too simple to me...
In [5]:
from IPython.display import Image
I started reading and installing everything in order, following the PyTables instructions on github.com, but it turned out that everything was already installed (mostly in C:/Users/kiss/Anaconda/Lib/site-packages)
At first this command went through, and the test below showed that everything should work... but then an error with pandas popped up, although earlier I had imported something from it to render tables... I tried to update pandas... and that is when it all began...
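Before installing anything, it can help to check what is already present. The following is a minimal sketch, assuming Python 3 (the notebook itself ran on Python 2.7); the helper name `find_installed` is made up for illustration, and the module list is only an example.

```python
import importlib.util


def find_installed(names):
    """Map each top-level module name to its file location, or None if absent."""
    found = {}
    for name in names:
        # find_spec locates the module without importing it
        spec = importlib.util.find_spec(name)
        found[name] = spec.origin if spec is not None else None
    return found


if __name__ == "__main__":
    # Example list of PyTables-related modules; adjust as needed.
    for name, location in find_installed(["numpy", "numexpr", "tables"]).items():
        print(name, "->", location or "not installed")
```

Because the modules are only located and not imported, this check should not trip over a broken binary extension.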
In [9]:
import tables
ValueError                                Traceback (most recent call last)
<ipython-input-9-389ecae14f10> in <module>()
----> 1 import tables

C:\Users\kiss\Anaconda\lib\site-packages\tables\__init__.py in <module>()
     81 # Necessary imports to get versions stored on the cython extension
---> 82 from tables.utilsextension import (get_pytables_version, get_hdf5_version,
     83     getPyTablesVersion, getHDF5Version)  # Pending Deprecation!

C:\Users\kiss\Documents\IPython Notebooks\web\oboobs\numpy.pxd in init tables.utilsextension (tables\utilsextension.c:14102)()

ValueError: numpy.dtype has the wrong size, try recompiling
In [2]:
PyTables version:  3.0.0
HDF5 version:      1.8.11
NumPy version:     1.7.1
Numexpr version:   2.2.2 (not using Intel's VML/MKL)
Zlib version:      1.2.3 (in Python interpreter)
LZO version:       2.06 (Aug 12 2011)
BZIP2 version:     1.0.6 (6-Sept-2010)
Blosc version:     1.2.3 (2013-05-17)
Cython version:    0.19.2
Python version:    2.7.5 |Anaconda 1.8.0 (64-bit)| (default, Jul  1 2013, 12:37:52) [MSC v.1500 64 bit (AMD64)]
Byte-ordering:     little
Detected cores:    2
Default encoding:  ascii
Performing only a light (yet comprehensive) subset of the test suite.
If you want a more complete test, try passing the --heavy flag to this script
(or set the 'heavy' parameter in case you are using tables.test() call).
The whole suite will take more than 4 hours to complete on a relatively
modern CPU and around 512 MB of main memory.

In [1]:
import pandas
ValueError                                Traceback (most recent call last)
<ipython-input-1-d6ac987968b6> in <module>()
----> 1 import pandas

C:\Users\kiss\Anaconda\lib\site-packages\pandas\__init__.py in <module>()
      5 try:
----> 6     from . import hashtable, tslib, lib
      7 except Exception:  # pragma: no cover
      8     import sys

C:\Users\kiss\Documents\IPython Notebooks\web\oboobs\numpy.pxd in init pandas.hashtable (pandas\hashtable.c:19547)()

ValueError: numpy.dtype has the wrong size, try recompiling
numpy.dtype has the wrong size, try recompiling
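This error usually means that a compiled extension (here, parts of tables and pandas) was built against a different NumPy version than the one now installed. A rough way to see which packages are affected is to try importing each one and record the result. This is only a sketch, assuming Python 3; `probe_imports` is a hypothetical helper name.

```python
import importlib


def probe_imports(names):
    """Try to import each module; return name -> version string or error text."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # ImportError, or ValueError from a binary mismatch
            report[name] = "FAILED: {}: {}".format(type(exc).__name__, exc)
        else:
            report[name] = getattr(module, "__version__", "unknown version")
    return report


if __name__ == "__main__":
    for name, status in probe_imports(["numpy", "tables", "pandas", "numexpr"]).items():
        print("{:10s} {}".format(name, status))
```

Packages that report FAILED with a "wrong size" message are the likely candidates for reinstalling from binaries that match the installed NumPy.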

I was watching a video and at the same time wanted to update pandas from the Windows console... And there a NumPy update "started by itself".
In []:
 pip install pandas --upgrade
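The pip help output further down says that `--upgrade` is recursive, which would explain why NumPy was upgraded too, and it also lists `--no-deps`. A minimal sketch of building an upgrade command that leaves dependencies alone follows; it assumes pip can be run as `python -m pip`, and the function name `pip_upgrade_command` is made up for illustration.

```python
import subprocess
import sys


def pip_upgrade_command(package, with_deps=False):
    """Build a pip command that upgrades a single package."""
    cmd = [sys.executable, "-m", "pip", "install", "--upgrade"]
    if not with_deps:
        cmd.append("--no-deps")  # leave NumPy and other dependencies untouched
    cmd.append(package)
    return cmd


if __name__ == "__main__":
    cmd = pip_upgrade_command("pandas")
    print(" ".join(cmd))
    # Uncomment to actually run the upgrade:
    # subprocess.check_call(cmd)
```

The trade-off is that with `--no-deps` the upgraded package may require a newer dependency than the one installed, so the imports are worth rechecking afterwards.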
In [4]:
NameError                                 Traceback (most recent call last)
<ipython-input-4-f21b25ab6e8f> in <module>()
----> 1 s=Series([1,2,3,4,5],index=['a','b','c','d','e'])

NameError: name 'Series' is not defined
In [3]:
import numpy
In [8]:
NameError                                 Traceback (most recent call last)
<ipython-input-8-fdea071a6cc9> in <module>()
----> 1 tables.test()

NameError: name 'tables' is not defined
In [10]:
!pip help install
  pip install [options] <requirement specifier> ...
  pip install [options] -r <requirements file> ...
  pip install [options] [-e] <vcs project url> ...
  pip install [options] [-e] <local project path> ...
  pip install [options] <archive url/path> ...

  Install packages from:
  - PyPI (and other indexes) using requirement specifiers.
  - VCS project urls.
  - Local project directories.
  - Local or remote source archives.
  pip also supports installing from "requirements files", which provide
  an easy way to specify a whole environment to be installed.
  See http://www.pip-installer.org for details on VCS url formats and
  requirements files.

Install Options:
  -e, --editable <path/url>   Install a project in editable mode (i.e.
                              setuptools "develop mode") from a local project
                              path or a VCS url.
  -r, --requirement <file>    Install from the given requirements file. This
                              option can be used multiple times.
  -b, --build <dir>           Directory to unpack packages into and build in.
                              The default in a virtualenv is "<venv
                              path>/build". The default for global installs is
                              "<OS temp dir>/pip_build_<username>".
  -t, --target <dir>          Install packages into <dir>.
  -d, --download <dir>        Download packages into <dir> instead of
                              installing them, regardless of what's already
  --download-cache <dir>      Cache downloaded packages in <dir>.
  --src <dir>                 Directory to check out editable projects into.
                              The default in a virtualenv is "<venv
                              path>/src". The default for global installs is
                              "<current dir>/src".
  -U, --upgrade               Upgrade all packages to the newest available
                              version. This process is recursive regardless of
                              whether a dependency is already satisfied.
  --force-reinstall           When upgrading, reinstall all packages even if
                              they are already up-to-date.
  -I, --ignore-installed      Ignore the installed packages (reinstalling
  --no-deps                   Don't install package dependencies.
  --no-install                Download and unpack all packages, but don't
                              actually install them.
  --no-download               Don't download any packages, just install the
                              ones already downloaded (completes an install
                              run with --no-install).
  --install-option <options>  Extra arguments to be supplied to the setup.py
                              install command (use like --install-option
                              ="--install-scripts=/usr/local/bin"). Use
                              multiple --install-option options to pass
                              multiple options to setup.py install. If you are
                              using an option with a directory path, be sure
                              to use absolute path.
  --global-option <options>   Extra global options to be supplied to the
                              setup.py call before the install command.
  --user                      Install using the user scheme.
  --egg                       Install as self contained egg file, like
                              easy_install does.
  --root <dir>                Install everything relative to this alternate
                              root directory.
  --use-wheel                 Find and prefer wheel archives when searching
                              indexes and find-links locations. Default to
                              accepting source archives.
  --pre                       Include pre-release and development versions. By
                              default, pip only finds stable versions.
  --no-clean                  Don't clean up build directories.

Package Index Options:
  -i, --index-url <url>       Base URL of Python Package Index (default
  --extra-index-url <url>     Extra URLs of package indexes to use in addition
                              to --index-url.
  --no-index                  Ignore package index (only looking at --find-
                              links URLs instead).
  -f, --find-links <url>      If a url or path to an html file, then parse for
                              links to archives. If a local path or file://
                              url that's a directory, then look for archives
                              in the directory listing.
  -M, --use-mirrors           Use the PyPI mirrors as a fallback in case the
                              main index is down.
  --mirrors <url>             Specific mirror URLs to query when --use-mirrors
                              is used.
  --allow-external <package>  Allow the installation of externally hosted
  --allow-all-external        Allow the installation of all externally hosted
  --no-allow-external         Disallow the installation of all externally
                              hosted files
  --allow-insecure <package>  Allow the installation of insecure and
                              unverifiable files
  --no-allow-insecure         Disallow the installation of insecure and
                              unverifiable files

General Options:
  -h, --help                  Show help.
  -v, --verbose               Give more output. Option is additive, and can be
                              used up to 3 times.
  -V, --version               Show version and exit.
  -q, --quiet                 Give less output.
  --log <file>                Log file where a complete (maximum verbosity)
                              record will be kept.
  --proxy <proxy>             Specify a proxy in the form
  --timeout <sec>             Set the socket timeout (default 15 seconds).
  --exists-action <action>    Default action when a path already exists:
                              (s)witch, (i)gnore, (w)ipe, (b)ackup.
  --cert <path>               Path to alternate CA bundle.

In [2]:
import tables
In [3]:
import numpy
In [*]:
PyTables version:  3.0.0
HDF5 version:      1.8.11
NumPy version:     1.8.0
Numexpr version:   2.2.2 (not using Intel's VML/MKL)
Zlib version:      1.2.3 (in Python interpreter)
LZO version:       2.06 (Aug 12 2011)
BZIP2 version:     1.0.6 (6-Sept-2010)
Blosc version:     1.2.3 (2013-05-17)
Cython version:    0.19.2
C:\Users\kiss\Anaconda\lib\site-packages\tables\conditions.py:447: DeprecationWarning: using `oa_ndim == 0` when `op_axes` is NULL is deprecated. Use `oa_ndim == -1` or the MultiNew iterator for NumPy <1.8 compatibility
  return func(*args)
C:\Users\kiss\Anaconda\lib\site-packages\tables\tests\test_array.py:2537: DeprecationWarning: using a boolean instead of an integer will result in an error in the future
  a = nparr[key]
C:\Users\kiss\Anaconda\lib\site-packages\tables\tests\test_array.py:2585: DeprecationWarning: using a boolean instead of an integer will result in an error in the future
  s = nparr[key]
C:\Users\kiss\Anaconda\lib\site-packages\tables\tests\test_array.py:2586: DeprecationWarning: using a boolean instead of an integer will result in an error in the future
  nparr[key] = s * 2
C:\Users\kiss\Anaconda\lib\site-packages\tables\tests\test_array.py:2605: DeprecationWarning: using a boolean instead of an integer will result in an error in the future
  nparr[key] = 2   # broadcast value
C:\Users\kiss\Anaconda\lib\site-packages\numexpr\necompiler.py:742: DeprecationWarning: using `oa_ndim == 0` when `op_axes` is NULL is deprecated. Use `oa_ndim == -1` or the MultiNew iterator for NumPy <1.8 compatibility
  return compiled_ex(*arguments, **kwargs)

And this is after also reinstalling Numexpr and Pandas (for good measure)
In [*]:
PyTables version:  3.0.0
HDF5 version:      1.8.11
NumPy version:     1.8.0
Numexpr version:   2.3 (using VML/MKL 11.1.1)
Zlib version:      1.2.3 (in Python interpreter)
LZO version:       2.06 (Aug 12 2011)
BZIP2 version:     1.0.6 (6-Sept-2010)
Blosc version:     1.2.3 (2013-05-17)
Cython version:    0.19.2
C:\Users\kiss\Anaconda\lib\site-packages\tables\tests\test_array.py:2537: DeprecationWarning: using a boolean instead of an integer will result in an error in the future
  a = nparr[key]
C:\Users\kiss\Anaconda\lib\site-packages\tables\tests\test_array.py:2585: DeprecationWarning: using a boolean instead of an integer will result in an error in the future
  s = nparr[key]
C:\Users\kiss\Anaconda\lib\site-packages\tables\tests\test_array.py:2586: DeprecationWarning: using a boolean instead of an integer will result in an error in the future
  nparr[key] = s * 2
C:\Users\kiss\Anaconda\lib\site-packages\tables\tests\test_array.py:2605: DeprecationWarning: using a boolean instead of an integer will result in an error in the future
  nparr[key] = 2   # broadcast value

No worries, it's just a warning. You upgraded NumPy, and this is a NumPy 1.8.0 deprecation warning. Explained in this pull request. Continuation in this PR.
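If the deprecation warnings are just noise, they can be silenced for a single call with the standard `warnings` module. This is a sketch under the assumption that the warnings are ordinary `DeprecationWarning`s, as the output above suggests; `call_quietly` and `noisy` are hypothetical names.

```python
import warnings


def call_quietly(func, *args, **kwargs):
    """Call func while ignoring DeprecationWarning, then restore the filters."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        return func(*args, **kwargs)


def noisy():
    # Hypothetical stand-in for code that triggers a deprecation warning.
    warnings.warn("old behaviour", DeprecationWarning)
    return "done"


if __name__ == "__main__":
    print(call_quietly(noisy))
```

The same wrapper could presumably be used as `call_quietly(tables.test)`; the filter applies only inside the `with` block, so warnings elsewhere in the session stay visible.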
In [1]:
import pandas
In [2]:
Help on package pandas:



    pandas - a powerful data analysis and manipulation library for Python
    See http://pandas.sourceforge.net for full documentation. Otherwise, see the
    docstrings of the various objects in the pandas namespace:

    compat (package)
    computation (package)
    core (package)
    io (package)
    rpy (package)
    sandbox (package)
    sparse (package)
    stats (package)
    tests (package)
    tools (package)
    tseries (package)
    util (package)


    NaT = NaT
    __docformat__ = 'restructuredtext'
    __version__ = '0.13.1'
    __warningregistry__ = {('numpy.ufunc size changed, may indicate binary...
    describe_option = <pandas.core.config.CallableDynamicDoc object>
    get_option = <pandas.core.config.CallableDynamicDoc object>
    options = <pandas.core.config.DictWrapper object>
    plot_params = {'xaxis.compat': False}
    reset_option = <pandas.core.config.CallableDynamicDoc object>
    set_option = <pandas.core.config.CallableDynamicDoc object>


