
fastai_old's Introduction

OLD REPO - PLEASE USE fastai/fastai

fastai_old's People

Contributors

bearpelican, elyase, fredmonroe, gokkulnath, jph00, k0ala, micpie, prajjwal1, radekosmulski, sgugger, stas00


fastai_old's Issues

Can we avoid PIL

Hi, I see that PIL is still imported in the vision module.
Would it be easy to avoid it so that we only use opencv?
What do you think?

tqdm vertical white space

Is it just on my setup, or do you also get this strange vertical white space when tqdm is used? e.g. in dev_nb/001b_fit.ipynb

(screenshot)

It looks like it spits out a whole new notebook output entry for each update, which causes the weird vertical white space (as if there were two \n\n in between).

    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(IntProgress(value=0, max=79), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0 0.2706044764518738\n"
     ]
    },

Not using softmax with cross entropy loss in 001b_fit.ipynb

https://github.com/fastai/fastai_v1/blob/ec6fc340ea9d6077f04ea87e2971d5ad8c841f4a/dev_nb/001b_fit.ipynb#L273

In the section "Simplify nn.Sequential layers", the loss F.cross_entropy is applied directly to the flattened output, without a softmax layer. Shouldn't it be one of the following:

  • Apply softmax to the outputs and use cross entropy loss
  • Apply log softmax to the outputs and use NLL loss

P.S.: Thanks for the step-by-step explanation!
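
For reference, PyTorch's F.cross_entropy already combines log_softmax and nll_loss internally, so applying it directly to the raw logits is the intended usage; adding an explicit softmax before it would apply the normalization twice. A plain-Python sketch of the equivalence (no PyTorch needed):

```python
import math

def log_softmax(logits):
    # numerically stable log-softmax (max-subtraction trick)
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def nll(log_probs, target):
    # negative log-likelihood of the target class
    return -log_probs[target]

def cross_entropy(logits, target):
    # cross entropy computed directly from raw logits:
    # logsumexp(logits) - logits[target]
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return lse - logits[target]

logits, target = [2.0, -1.0, 0.5], 0
assert abs(cross_entropy(logits, target) - nll(log_softmax(logits), target)) < 1e-12
```

So "log softmax + NLL" and "cross entropy on logits" are the same computation; the notebook simply uses the fused form.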

image size rounding problem in dev_nb/dogscats-test-aug.ipynb

dev_nb/dogscats-test-aug.ipynb fails at the end running:

%time for (x,y) in iter(data.trn_dl): pass
RuntimeError                              Traceback (most recent call last)
<timed exec> in <module>()

<ipython-input-9-69c6e2ffd7b7> in __iter__(self)
      5 
      6     def __iter__(self):
----> 7         for b in self.dl:
      8             x, y = b[0].to(self.device),b[1].to(self.device)
      9             x = (x - self.m[None,:,None,None]) / self.s[None,:,None,None]

~/anaconda3/envs/pytorch-dev/lib/python3.6/site-packages/torch/utils/data/dataloader.py in __next__(self)
    312         if self.num_workers == 0:  # same-process loading
    313             indices = next(self.sample_iter)  # may raise StopIteration
--> 314             batch = self.collate_fn([self.dataset[i] for i in indices])
    315             if self.pin_memory:
    316                 batch = pin_memory_batch(batch)

~/anaconda3/envs/pytorch-dev/lib/python3.6/site-packages/torch/utils/data/dataloader.py in default_collate(batch)
    185     elif isinstance(batch[0], collections.Sequence):
    186         transposed = zip(*batch)
--> 187         return [default_collate(samples) for samples in transposed]
    188 
    189     raise TypeError((error_msg.format(type(batch[0]))))

~/anaconda3/envs/pytorch-dev/lib/python3.6/site-packages/torch/utils/data/dataloader.py in <listcomp>(.0)
    185     elif isinstance(batch[0], collections.Sequence):
    186         transposed = zip(*batch)
--> 187         return [default_collate(samples) for samples in transposed]
    188 
    189     raise TypeError((error_msg.format(type(batch[0]))))

~/anaconda3/envs/pytorch-dev/lib/python3.6/site-packages/torch/utils/data/dataloader.py in default_collate(batch)
    162             storage = batch[0].storage()._new_shared(numel)
    163             out = batch[0].new(storage)
--> 164         return torch.stack(batch, 0, out=out)
    165     elif elem_type.__module__ == 'numpy' and elem_type.__name__ != 'str_' \
    166             and elem_type.__name__ != 'string_':

RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. 
Got 223 and 224 in dimension 2 at /opt/conda/conda-bld/pytorch_1532579245307/work/aten/src/TH/generic/THTensorMath.cpp:3616

It looks like an off-by-one boundary problem. The following fixes it, but perhaps it needs something more principled than chasing edges (and it must have worked for you when you committed it):

--- a/dev_nb/dogscats-test-aug.ipynb
+++ b/dev_nb/dogscats-test-aug.ipynb
@@ -494,7 +494,7 @@
     "        matrix = matrix[:2,:]\n",
     "        _, h, w = x.size()\n",
     "        ratio = min(h,w) / self.size\n",
-    "        img_size = torch.Size([1,3,int(h/ratio),int(w/ratio)])\n",
+    "        img_size = torch.Size([1,3,int(h/ratio)+1,int(w/ratio)+1])\n",
     "        coords = F.affine_grid(matrix[None], img_size)\n",
     "        a = random.randint(0, img_size[2]-self.size) if img_size[2] >= self.size else 0\n",
     "        b = random.randint(0, img_size[3]-self.size) if img_size[3] >= self.size else 0\n",

The same expression is also used earlier in the notebook, but without problems at run time:

img_size = torch.Size([1,3,int(h/ratio),int(w/ratio)])
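
The off-by-one comes from int() truncating h/ratio and w/ratio: mathematically min(h,w)/ratio equals size exactly, but floating-point division can land just below it, and truncation then yields size-1 (the 223 vs 224 above). A hedged alternative to the +1 patch is math.ceil; the sketch below uses a hypothetical scaled_size helper mirroring the notebook's computation:

```python
import math

def scaled_size(h, w, size):
    # hypothetical helper mirroring the notebook's resize computation;
    # math.ceil (instead of int truncation or a blanket +1) guarantees
    # the scaled short side never lands below `size`
    ratio = min(h, w) / size
    return math.ceil(h / ratio), math.ceil(w / ratio)

# the short side is always >= size, so a size-sized crop can never fail
for h in range(224, 600, 7):
    for w in range(224, 600, 11):
        nh, nw = scaled_size(h, w, 224)
        assert min(nh, nw) >= 224
```

Unlike the +1 patch, this only rounds up when the division actually falls short, so exact cases keep their exact size.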

001b_fit: get_data() parameters

In the "Transformation" section, the global variables and the get_data() arguments are intermixed:

train_tds = TfmDataset(train_ds, mnist2image)
valid_tds = TfmDataset(valid_ds, mnist2image)

def get_data(train_ds, valid_ds, bs):
    return (DataLoader(train_tds, bs,   shuffle=True),
            DataLoader(valid_tds, bs*2, shuffle=False))

train_dl,valid_dl = get_data(train_ds, valid_ds, bs)

The parameters train_ds and valid_ds are not used in the function body; the global variables train_tds and valid_tds are used instead. The intended usage is either to call the function with train_tds and valid_tds, or to apply the TfmDataset transformation inside the function.
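
A minimal sketch of the second option, applying the transform inside get_data so the parameters are actually used. The TfmDataset/DataLoader stand-ins and the identity mnist2image below are assumptions, included only so the sketch runs without fastai/PyTorch:

```python
# Tiny stand-ins (assumptions) so the sketch runs standalone:
class TfmDataset:
    def __init__(self, ds, tfm): self.ds, self.tfm = ds, tfm
    def __len__(self): return len(self.ds)
    def __getitem__(self, i):
        x, y = self.ds[i]
        return self.tfm(x), y

class DataLoader:
    def __init__(self, ds, bs, shuffle=False): self.ds, self.bs = ds, bs
    def __len__(self): return (len(self.ds) + self.bs - 1) // self.bs  # batches

def mnist2image(x): return x  # placeholder transform

# corrected get_data: the transform is applied inside, so the
# train_ds/valid_ds parameters are actually used
def get_data(train_ds, valid_ds, bs):
    return (DataLoader(TfmDataset(train_ds, mnist2image), bs,     shuffle=True),
            DataLoader(TfmDataset(valid_ds, mnist2image), bs * 2, shuffle=False))

train_dl, valid_dl = get_data([(i, 0) for i in range(10)], [(i, 0) for i in range(4)], bs=2)
assert len(train_dl) == 5 and len(valid_dl) == 1
```

With this version the caller passes the raw datasets, and no module-level train_tds/valid_tds globals are needed.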

The same issue appears in the "CUDA" section:

@dataclass
class DeviceDataLoader():
    dl: DataLoader
    device: torch.device
    progress_func: Callable
        
    def __len__(self): return len(self.dl)
    def __iter__(self):
        self.gen = (to_device(self.device,o) for o in self.dl)
        if self.progress_func is not None:
            self.gen = self.progress_func(self.gen, total=len(self.dl), leave=False)
        return iter(self.gen)

    @classmethod
    def create(cls, *args, device=default_device, progress_func=tqdm, **kwargs):
        return cls(DataLoader(*args, **kwargs), device=device, progress_func=progress_func)

def get_data(train_ds, valid_ds, bs):
    return (DeviceDataLoader.create(train_tds, bs,   shuffle=True),
            DeviceDataLoader.create(valid_tds, bs*2, shuffle=False))

train_dl,valid_dl = get_data(train_tds, valid_tds, bs)

rules is not defined in 007b_imdb_classifier

----> 1 tokenizer = Tokenizer(rules=rules, special_cases=[BOS, FLD, UNK, PAD])
      2 bs,bptt = 50,70
      3 data = data_from_textcsv(LM_PATH, tokenizer, data_func=lm_data, max_vocab=60000, bs=bs, bptt=bptt)

NameError: name 'rules' is not defined

Solution: replace rules with default_rules:
tokenizer = Tokenizer(rules=default_rules, special_cases=[BOS, FLD, UNK, PAD])

Issue with fastai installation

Hey,
I am trying to install fastai from PyPI but keep getting this error:

  Could not find a version that satisfies the requirement torch>=0.4.9 (from fastai==1.0.0b3) (from versions: 0.1.2, 0.1.2.post1, 0.3.1, 0.4.0, 0.4.1)
No matching distribution found for torch>=0.4.9 (from fastai==1.0.0b3)

I also get the same error if I try the dev installation:

  Using cached https://files.pythonhosted.org/packages/62/08/09ced1bae24016d96a1ddad0e0516275f113d929adf1fbd4a30a88002d68/fastprogress-0.1.5-py3-none-any.whl
Requirement already satisfied: ipython in /home/bukharih/anaconda3/envs/fast_v_1/lib/python3.6/site-packages (from fastai==1.0.0b3) (6.5.0)
Requirement already satisfied: matplotlib in /home/bukharih/anaconda3/envs/fast_v_1/lib/python3.6/site-packages (from fastai==1.0.0b3) (2.2.3)
Requirement already satisfied: numpy>=1.12 in /home/bukharih/anaconda3/envs/fast_v_1/lib/python3.6/site-packages (from fastai==1.0.0b3) (1.15.1)
Requirement already satisfied: pandas in /home/bukharih/anaconda3/envs/fast_v_1/lib/python3.6/site-packages (from fastai==1.0.0b3) (0.23.4)
Requirement already satisfied: Pillow in /home/bukharih/anaconda3/envs/fast_v_1/lib/python3.6/site-packages (from fastai==1.0.0b3) (5.2.0)
Requirement already satisfied: scipy in /home/bukharih/anaconda3/envs/fast_v_1/lib/python3.6/site-packages (from fastai==1.0.0b3) (1.1.0)
Collecting spacy (from fastai==1.0.0b3)
  Using cached https://files.pythonhosted.org/packages/24/de/ac14cd453c98656d6738a5669f96a4ac7f668493d5e6b78227ac933c5fd4/spacy-2.0.12.tar.gz
Collecting torch>=0.4.9 (from fastai==1.0.0b3)
  Could not find a version that satisfies the requirement torch>=0.4.9 (from fastai==1.0.0b3) (from versions: 0.1.2, 0.1.2.post1, 0.3.1, 0.4.0, 0.4.1)
No matching distribution found for torch>=0.4.9 (from fastai==1.0.0b3)

getting doc nbs to work

The docs/*.ipynb notebooks aren't working out of the box. I needed to add:

import os, sys
path = os.path.realpath('.')
sys.path.append(path+ "/gen_doc")
from nbdoc import *

base = path+"/../" # root of fastai_v1
sys.path.append(base)
#print(sys.path)

# this is needed to get the sub-packages to load
!touch {base}"__init__.py"
!touch {base}"fastai_v1.py"
!ln -fs {base}"dev_nb" {base}"fastai_v1"
!touch {base}"fastai_v1/__init__.py"

perhaps this can be done behind the scenes?

I needed to get it working as I'm experimenting with a faster stripout process.

Also in docs/fastai_v1.nb_001b.ipynb:

show_doc_from_name('fastai_v1.nb_001b','DeviceDataLoader.progress_func')
Class DeviceDataLoader doesn't have a function named progress_func.

Installation for CPU only fails

Is it necessary to have the cupy library in the requirements? I cannot install fastai on my CPU-only local machine, and get the error:
Exception: Your CUDA environment is invalid. Please check above error log.
