## Major Updates
- Updated my README emoji game to be more ambiguous while maintaining a fun and heartwarming vibe. 🐕
- Support for Python 3.5
- Extensive rewrite of README to focus on new users and building an NLP pipeline.
- Support for PyTorch 1.2
- Added `torchnlp.random` for finer-grained control of random state, building on PyTorch's `fork_rng`. This module controls the random state of `torch`, `numpy`, and `random`.

        :::python
        import random

        import numpy
        import torch

        from torchnlp.random import fork_rng

        with fork_rng(seed=123):  # Ensure determinism
            print('Random:', random.randint(1, 2**31))
            print('Numpy:', numpy.random.randint(1, 2**31))
            print('Torch:', int(torch.randint(1, 2**31, (1,))))
- Refactored `torchnlp.samplers`, enabling pipelining. For example:

        :::python
        from torchnlp.samplers import DeterministicSampler
        from torchnlp.samplers import BalancedSampler

        data = ['a', 'b', 'c'] + ['c'] * 100
        sampler = BalancedSampler(data, num_samples=3)
        sampler = DeterministicSampler(sampler, random_seed=12)
        print([data[i] for i in sampler])  # ['c', 'b', 'a']
- Added `torchnlp.samplers.balanced_sampler` for balanced sampling, extending PyTorch's `WeightedRandomSampler`.
- Added `torchnlp.samplers.deterministic_sampler` for deterministic sampling based on `torchnlp.random`.
- Added `torchnlp.samplers.distributed_batch_sampler` for distributed batch sampling.
- Added `torchnlp.samplers.oom_batch_sampler` to sample large batches first in order to force an out-of-memory error early.
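    To illustrate the idea with a plain-Python sketch (the actual sampler's interface may differ): order batches so the largest run first, and any out-of-memory crash happens immediately instead of hours into training.

        :::python
        # Hypothetical sketch, not the torchnlp implementation: run the
        # biggest batches first so an out-of-memory error surfaces right away.
        def biggest_batches_first(batches, size_fn=len):
            return sorted(batches, key=size_fn, reverse=True)

        batches = [[1], [2, 3, 4, 5], [6, 7]]
        print(biggest_batches_first(batches))  # [[2, 3, 4, 5], [6, 7], [1]]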
- Added `torchnlp.utils.lengths_to_mask` to help create masks from a batch of sequences.
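    For illustration, an equivalent mask can be built in plain PyTorch (the `torchnlp` signature may differ):

        :::python
        import torch

        lengths = torch.tensor([1, 2, 3])
        # Position i is True when i < length for that sequence.
        mask = torch.arange(lengths.max()) < lengths.unsqueeze(-1)
        # tensor([[ True, False, False],
        #         [ True,  True, False],
        #         [ True,  True,  True]])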
- Added `torchnlp.utils.get_total_parameters` to measure the number of parameters in a model.
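    The plain-PyTorch equivalent, for reference:

        :::python
        import torch.nn as nn

        model = nn.Linear(10, 5)
        # Count every element in every parameter tensor.
        total = sum(p.numel() for p in model.parameters())
        print(total)  # 55 (10 * 5 weights + 5 biases)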
- Added `torchnlp.utils.get_tensors` to measure the size of an object in number of tensor elements. This is useful for dynamic batch sizing and for `torchnlp.samplers.oom_batch_sampler`.

        :::python
        import torch

        from torchnlp.utils import get_tensors

        random_object_ = tuple([{'t': torch.tensor([1, 2])}, torch.tensor([2, 3])])
        tensors = get_tensors(random_object_)
        assert len(tensors) == 2

- Added a corporate sponsor to the library: https://wellsaidlabs.com/
## Minor Updates
- Fixed the `snli` example (https://github.com/PetrochukM/PyTorch-NLP/pull/84).
- Updated `.gitignore` to support Python's virtual environments (https://github.com/PetrochukM/PyTorch-NLP/pull/84).
- Removed the `requests` and `pandas` dependencies. Only two dependencies remain, which is useful for production environments (https://github.com/PetrochukM/PyTorch-NLP/pull/84).
- Added `LazyLoader` to reduce dependency requirements (https://github.com/PetrochukM/PyTorch-NLP/commit/4e84780a8a741d6a90f2752edc4502ab2cf89ecb).
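    A minimal sketch of the lazy-loading pattern, assuming a deferred `importlib` import (not necessarily the library's exact implementation):

        :::python
        import importlib

        class LazyLoader:
            """Defer importing a heavy module until an attribute is accessed."""

            def __init__(self, module_name):
                self._module_name = module_name
                self._module = None

            def __getattr__(self, attr):
                if self._module is None:
                    self._module = importlib.import_module(self._module_name)
                return getattr(self._module, attr)

        pandas = LazyLoader('pandas')  # Nothing is imported yet.
        # pandas.DataFrame(...)        # The real import happens on first use.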
- Removed the unused `torchnlp.datasets.Dataset` class in favor of plain Python lists of dictionaries and `pandas` (https://github.com/PetrochukM/PyTorch-NLP/pull/84).
- Support for downloading `tar.gz` files and unpacking them faster (https://github.com/PetrochukM/PyTorch-NLP/commit/eb61fee854576c8a57fd9a20ee03b6fcb89c493a).
- Renamed `itos` and `stoi` to `index_to_token` and `token_to_index`, respectively (https://github.com/PetrochukM/PyTorch-NLP/pull/84).
- Fixed `batch_encode`, `batch_decode`, and `enforce_reversible` for `torchnlp.encoders.text` (https://github.com/PetrochukM/PyTorch-NLP/pull/69).
- Fixed `FastText` vector downloads (https://github.com/PetrochukM/PyTorch-NLP/pull/72).
- Fixed the documentation for `LockedDropout` (https://github.com/PetrochukM/PyTorch-NLP/pull/73).
- Fixed a bug in `weight_drop` (https://github.com/PetrochukM/PyTorch-NLP/pull/76).
- `stack_and_pad_tensors` now returns a named tuple for readability (https://github.com/PetrochukM/PyTorch-NLP/pull/84).
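    A concept sketch in plain PyTorch; the named-tuple field names here are an assumption, not necessarily what `torchnlp` uses:

        :::python
        from collections import namedtuple

        import torch

        # Assumed field names, for illustration only.
        SequenceBatch = namedtuple('SequenceBatch', ['tensor', 'lengths'])

        def stack_and_pad(tensors, padding_index=0):
            lengths = torch.tensor([t.numel() for t in tensors])
            padded = torch.nn.utils.rnn.pad_sequence(
                tensors, batch_first=True, padding_value=padding_index)
            return SequenceBatch(padded, lengths)

        batch = stack_and_pad([torch.tensor([1, 2]), torch.tensor([3])])
        print(batch.tensor)   # tensor([[1, 2], [3, 0]])
        print(batch.lengths)  # tensor([2, 1])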
- Added `torchnlp.utils.split_list` in favor of `torchnlp.utils.resplit_datasets`. This is enabled by the modularity of `torchnlp.random` (https://github.com/PetrochukM/PyTorch-NLP/pull/84).
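    A hypothetical sketch of ratio-based list splitting (the real signature may differ):

        :::python
        def split_list(list_, splits=(0.8, 0.1, 0.1)):
            # Slice `list_` into consecutive chunks sized by `splits` ratios.
            result, start = [], 0
            for split in splits:
                end = start + round(split * len(list_))
                result.append(list_[start:end])
                start = end
            return result

        train, dev, test = split_list(list(range(10)), splits=(0.6, 0.2, 0.2))
        print(train, dev, test)  # [0, 1, 2, 3, 4, 5] [6, 7] [8, 9]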
- Deprecated `torchnlp.utils.datasets_iterator` in favor of Python's `itertools.chain` (https://github.com/PetrochukM/PyTorch-NLP/pull/84).
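    For example, with the standard library:

        :::python
        from itertools import chain

        train, dev, test = ['a', 'b'], ['c'], ['d']
        # Iterate over several datasets back to back.
        for row in chain(train, dev, test):
            print(row)  # a, b, c, d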
- Deprecated `torchnlp.utils.shuffle` in favor of `torchnlp.random` (https://github.com/PetrochukM/PyTorch-NLP/pull/84).
- Support for encoding larger datasets following the fix of this issue (https://github.com/PetrochukM/PyTorch-NLP/issues/85).
- Added `torchnlp.samplers.repeat_sampler` following up on this issue: https://github.com/pytorch/pytorch/issues/15849
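    A minimal sketch of the repeating-sampler pattern discussed in that issue (the `torchnlp` implementation may differ): wrapping a sampler so iteration never ends lets a `DataLoader` reuse its workers across epochs instead of restarting them.

        :::python
        class RepeatSampler:
            """Wrap a sampler so that it yields indices forever."""

            def __init__(self, sampler):
                self.sampler = sampler

            def __iter__(self):
                while True:
                    yield from iter(self.sampler)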