Changelog

0.4.4 (unreleased)

Added support for Python 3.7 (StopIteration --> return), Pull Request #18 (thanks @andrewmfiorillo)
Fixed tests for Google translation examples
Updated tox/Travis-CI config files to include latest Python & pypy versions
Updated sphinx_rtd_theme to version 0.4.2 to fix rendering problems on RTD
Updated setup.py publish commands, Makefile & Manifest.in to new PyPI (using twine)
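
The "StopIteration --> return" change above refers to PEP 479: from Python 3.7 on, a StopIteration raised inside a generator body is converted into a RuntimeError, so generators have to end with a plain return. The following is a minimal sketch of that pattern, not the code changed in Pull Request #18; both function names are hypothetical.

```python
def take_until_blank_old(lines):
    """Pre-3.7 style (hypothetical example): ends by raising StopIteration."""
    for line in lines:
        if not line.strip():
            # On Python 3.7+ this surfaces as RuntimeError (PEP 479).
            raise StopIteration
        yield line


def take_until_blank(lines):
    """Python 3.7-compatible style: ends the generator with a plain return."""
    for line in lines:
        if not line.strip():
            return
        yield line


if __name__ == "__main__":
    sample = ["Heute ist Montag.", "Morgen ist Dienstag.", "", "ignored"]
    print(list(take_until_blank(sample)))
    try:
        list(take_until_blank_old(sample))
    except RuntimeError as exc:
        print("old style fails on Python 3.7+:", exc)
```

The return-based version behaves the same on older interpreters, which is why the replacement is safe across all supported Python versions.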

Removed dependency on NLTK, as it already is a TextBlob dependency
Temporary workaround for NLTK Issue #824 for tox/Travis-CI
(update 13/01/2015) NLTK Issue #824 fixed, workaround removed
Enabled pattern tagset conversion ('penn'|'universal'|'stts') for PatternTagger
Added tests for tagset conversion
Fixed test for Arabic translation example (Google translation has changed)
Added tests for lemmatizer
Bugfix: PatternAnalyzer no longer breaks on subsequent occurrences of the same (word, tag) pairs on Python3 (see comments on Pull Request #11)
Bugfix/performance enhancement: Sentiment dictionary in PatternAnalyzer is no longer reloaded for every sentence, Pull Request #11 (thanks @Arttii)

Docs hosted on RTD
Removed dependency on nltk's deprecated PunktWordTokenizer and replaced it with TreebankWordTokenizer (see nltk/nltk#746 (comment) for details)

Improved PatternParserNPExtractor (fewer false positives in verb filter)
Made sure that all keyword arguments with default None are checked with is not None
Fixed shortcut to _pattern.de in vendorized library
Added Makefile to facilitate development process
Added docs and API reference
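
The "is not None" item above matters because a truthiness test silently discards valid but falsy arguments such as an empty list. A minimal sketch of the difference, using hypothetical helper names and a made-up default list (none of this is taken from the textblob-de source):

```python
# Hypothetical default used only for this illustration.
DEFAULT_STOPWORDS = ["der", "die", "das"]


def filter_truthy(words, stopwords=None):
    """Buggy variant: an explicitly passed empty list is falsy and gets replaced."""
    active = stopwords if stopwords else DEFAULT_STOPWORDS
    return [w for w in words if w not in active]


def filter_is_not_none(words, stopwords=None):
    """Fixed variant: only a missing argument (None) triggers the default."""
    active = stopwords if stopwords is not None else DEFAULT_STOPWORDS
    return [w for w in words if w not in active]


if __name__ == "__main__":
    words = ["der", "Hund"]
    print(filter_truthy(words, stopwords=[]))       # default applied against the caller's wishes
    print(filter_is_not_none(words, stopwords=[]))  # empty list respected
```

With the explicit None check, a caller can disable filtering by passing an empty list, while omitting the argument still selects the default.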

Fixed tokenization in PatternParser (if initialized manually, punctuation was not always separated from words)
Improved handling of empty strings (Issue #3) and of strings containing single punctuation marks (Issue #4) in PatternTagger and PatternParser
Added tests for empty strings and for strings containing single punctuation marks

sdist is non-functional as important files are missing due to a misconfiguration in MANIFEST.in - does not affect wheels
Major internal refactoring (but no backwards-incompatible API changes) with the aim of restoring complete compatibility with the original pattern>=2.6 library on Python2
Separation of textblob and pattern code
On Python2 the vendorized version of pattern.text.de is only used if original is not installed (same as nltk)
Made pattern.de.pprint function and all parser keywords accessible to customise parser output
Access to the complete pattern.text.de API on Python2 and Python3 via: from textblob_de.packages import pattern_de as pd
tox passed on all major platforms (Win/Linux/OSX)

Option: Include punctuation in tags/pos_tags properties (b = TextBlobDE(text, tagger=PatternTagger(include_punc=True)))
Added BlobberDE() class initialized with German models
TextBlobDE(), Sentence(), WordList() and Word() classes are now all initialized with German models
Restored complete API compatibility with textblob.tokenizers module of the main TextBlob library

Noun Phrase Extraction: PatternParserNPExtractor() extracts NPs from Parser output
Refactored the way TextBlobDE() passes on arguments and keyword arguments to individual tools
Backwards-incompatible: Deprecate parser_show_lemmata=True keyword in TextBlob(). Use parser=PatternParser(lemmata=True) instead.

vastly improved tokenization (NLTKPunktTokenizer and PatternTokenizer with tests)
consistent use of specified tokenizer for all tools
TextBlobDE with initialized default models for German
Parsing (PatternParser) plus test_parsers.py
EXPERIMENTAL implementation of Polarity detection (PatternAnalyzer)
first attempt at extracting German Polarity clues into de-sentiment.xml
tox tests passing for py26, py27, py33 and py34