pip
package. All code
changes and discussion should move to the Keras repository.
For users looking for a place to start preprocessing data, consult the preprocessing layers guide and refer to the data loading utilities API.
Utilities for working with image data, text data, and sequence data.
License: Other
The fact that Python's hash function is used as the default hashing function in hashing_trick and one_hot is confusing (see issue #9500 in Keras) and has been discussed before (see issue #9635 in Keras). The hash function in Python 3 is randomized, which means that results obtained in different sessions are inconsistent, so using it in a data processing pipeline leads to inconsistent results.
While I understand that one_hot exists for historical reasons, this does not seem to justify preserving a function that gives inconsistent results. Even if this is backward compatible with previous versions of Keras, it is not backward compatible with itself, since different runs of the function give different results.
The simple solution would be to use "md5" as the default hashing function in hashing_trick and to make one_hot an alias for hashing_trick with hash_function='md5'.
Alternatively, since md5 is clearly slower than hash, a faster alternative could be used. Instead of md5, the xxHash function could be used. From what I know, xxHash is faster than md5 (but still slower than hash) while giving results of equal quality. The function is implemented in the xxhash package (ports for Python, R, C++ etc.).
I may provide a PR for this, but first I'd be grateful for comments, as I don't want to spend time on a PR that gets rejected.
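To make the proposal concrete, here is a minimal sketch of a deterministic word-to-index mapping based on md5, using only the standard library. The helper name stable_index and the modulo scheme are my own assumptions for illustration, not necessarily what hashing_trick does internally.

```python
import hashlib


def stable_index(word, n):
    """Map a word to an index in [1, n - 1] using md5.

    Hypothetical helper: unlike the built-in hash(), md5 gives the same
    value in every Python session, so the mapping is reproducible.
    """
    digest = hashlib.md5(word.encode("utf-8")).hexdigest()
    return int(digest, 16) % (n - 1) + 1


words = "the cat sat on the mat".split()
indices = [stable_index(w, 50) for w in words]
print(indices)  # identical on every run, unlike [hash(w) % 49 + 1 for w in words]
```

Index 0 is left unused here on the assumption that it is reserved (for padding, for example); collisions between different words are still possible, as with any hashing scheme.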
2.2.2
1.10.1
3.6.6
10.13.6
I'm trying to generate augmentations of my training data with zca_whitening and an ImageDataGenerator. But when I try to fit the generator (which is mandatory when using zca_whitening), the Python process consumes more and more memory (100 GB+) until it gets killed by the system.
This small example reproduces the leak:
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
def cause_leak():
    idg = ImageDataGenerator(zca_whitening = True)
    random_sample = np.random.random((1, 250, 250, 3))
    idg.fit(random_sample)

cause_leak()
The terminal output consists only of a warning saying that featurewise_center is overwritten when enabling zca_whitening. I don't think this is related to the problem, but who knows.
Does anybody know a workaround?
Since the split, Iterators are no longer Sequence objects, so fit_generator treats them as plain generators.
Should we modify keras_preprocessing to use Sequence where possible, or change the logic of *_generator so that it validates that the required methods exist instead of checking the type?
I am trying to use Tokenizer to handle string input. The "oov_token" parameter is set to "" when the Tokenizer is initialized. However, the index assigned to oov_token is greater than num_words, so this index can't be used directly in embedding_lookup.
Another question is how to use predefined words with Tokenizer, such as .
Hello, firstly I would like to thank you for this new library, which removes the burden of writing repetitive code, especially for text, and lets us focus on solving problems instead.
While reading text.py, I spotted the following construction in two different places:
if variable not in some_dict:
    some_dict[variable] = 1
else:
    some_dict[variable] += 1
If some_dict = defaultdict(int), then this code could be replaced by the one-liner some_dict[variable] += 1. Why not use it? According to the tests below, it is even faster:
In [1]: from collections import defaultdict;
In [2]: simple_dict = dict()
In [3]: def fun(): z = defaultdict(int); z['shoe'] += 1;
# >> Inserting a new element
In [4]: %timeit if 'shoe' not in simple_dict: simple_dict['shoe'] = 1
# 56.6 ns ± 0.0336 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [5]: %timeit fun
# 36.1 ns ± 0.00178 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
# >> Fetching existing element
In [6]: %timeit simple_dict['shoe']
# 54.3 ns ± 0.012 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [7]: ddict = defaultdict(int); ddict['shoe'] += 1
In [8]: %timeit ddict['shoe']
# 58.3 ns ± 0.0122 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
# >> Updating existing element:
In [9]: def fun_2():
   ...:     if 'shoe' not in simple_dict: simple_dict['shoe'] = 1
   ...:     else: simple_dict['shoe'] += 1
   ...:
In [10]: %timeit fun_2()
# 290 ns ± 0.171 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [11]: z = defaultdict(int)
In [12]: %timeit z['shoe'] += 1
# 122 ns ± 0.0166 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
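For reference, here is a small self-contained sketch showing that the two counting styles produce the same result. The function names are made up for this example, and it says nothing about speed.

```python
from collections import defaultdict


def count_explicit(words):
    """Count words with the explicit membership test found in text.py."""
    counts = {}
    for w in words:
        if w not in counts:
            counts[w] = 1
        else:
            counts[w] += 1
    return counts


def count_defaultdict(words):
    """Count words with defaultdict(int): missing keys start at 0."""
    counts = defaultdict(int)
    for w in words:
        counts[w] += 1
    return counts


words = "the cat sat on the mat".split()
print(count_explicit(words) == dict(count_defaultdict(words)))
```

One caveat worth keeping in mind: merely reading a missing key from a defaultdict inserts it, so code that later checks membership (word in counts) can behave differently unless the result is converted back to a plain dict.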
Hello everybody,
I'm facing a weird issue with image generators while iterating over them (Keras 2.2.0).
My validation set contains 7160 pictures. Then, I set my generator like this:
batch_size = 32
train_datagen = image.ImageDataGenerator()
train_generator = train_datagen.flow_from_directory("train/", target_size=(224, 224), batch_size=batch_size)
Up to here, everything looks normal: train_generator[0] returns a tuple of 32 image arrays and 32 label arrays, as expected.
The strange thing is that if I iterate with a for loop as follows
x_train = []
for x, y in train_generator:
    x_train.append(preprocess_input(x))
it simply iterates forever, and as a consequence x_train grows without bound!
I would instead expect exactly 224 iterations (7160 samples in batches of 32).
And indeed, if I ask for train_generator[405], I get a reasonable ValueError: Asked to retrieve element 405, but the Sequence has length 224.
What's going on here?! Am I missing something about how ImageGenerators work?
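As far as I can tell, the iterators are designed to loop indefinitely when consumed with a for loop (training code is expected to stop after a fixed number of steps), while len() and indexing are bounded. The sketch below uses a made-up stand-in class, LoopingBatches, to mimic that behaviour without Keras and to show one bounded way to make a single pass.

```python
import math


class LoopingBatches:
    """Hypothetical stand-in for a Keras-style iterator.

    It is finite when used through len() and indexing, but endless
    when iterated directly.
    """

    def __init__(self, n_samples, batch_size):
        self.n = n_samples
        self.batch_size = batch_size

    def __len__(self):
        return math.ceil(self.n / self.batch_size)

    def __getitem__(self, idx):
        if idx >= len(self):
            raise ValueError(
                "Asked to retrieve element {}, but the Sequence has length {}"
                .format(idx, len(self)))
        start = idx * self.batch_size
        return list(range(start, min(start + self.batch_size, self.n)))

    def __iter__(self):
        # Wraps around forever; the caller decides when to stop.
        idx = 0
        while True:
            yield self[idx % len(self)]
            idx += 1


gen = LoopingBatches(n_samples=7160, batch_size=32)
batches = [gen[i] for i in range(len(gen))]  # exactly one pass over the data
print(len(batches), sum(len(b) for b in batches))
```

With the real generator the same idea would be to loop over range(len(train_generator)) and index into it, or to break out of the for loop after len(train_generator) batches.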
I believe the current implementation produces erroneous, unexpected behaviour if the rescale parameter is used (neither None nor 0) together with feature-wise normalization (featurewise_center, featurewise_std_normalization, ZCA whitening).
The fit() function computes the statistics from the original, un-rescaled inputs, and these statistics are then applied to the rescaled data. For instance, if the images are uint8 (in the range [0, 255]), the feature-wise mean may be 128. Then, if rescale=1./255, the output images will be in the range [0, 1], but the original mean of 128 will be subtracted.
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
images = np.random.randint(low=0, high=255, size=(10, 32, 32, 3))
rescale = 1. / 255
imagedatagen = ImageDataGenerator(rescale=rescale,
                                  featurewise_center=True)
imagedatagen.fit(images)
batchgen = imagedatagen.flow(images, batch_size=10)
batch = batchgen.next()
images = images.astype(float)
images *= rescale
mean = np.mean(images, axis=(0, 1, 2))
images -= mean
print('Data range should be (approximately): [{}, {}]. \n'
      'Actual data range is: [{}, {}]'.format(np.min(images),
                                              np.max(images),
                                              np.min(batch),
                                              np.max(batch)))
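The arithmetic behind the mismatch can also be seen on a handful of toy values, without Keras or NumPy. This is only an illustration of the scale problem; the pixel values are made up.

```python
pixels = [0, 64, 128, 192, 255]   # toy uint8 values
rescale = 1.0 / 255

# Statistic computed on the un-rescaled data, as fit() is described to do.
mean_raw = sum(pixels) / len(pixels)

rescaled = [p * rescale for p in pixels]
mean_rescaled = sum(rescaled) / len(rescaled)

# Subtracting the raw mean from rescaled data pushes everything far below 0.
mismatched = [r - mean_raw for r in rescaled]
# Subtracting a mean computed on the same scale centres the data around 0.
consistent = [r - mean_rescaled for r in rescaled]

print(min(mismatched), max(mismatched))
print(min(consistent), max(consistent))
```

Either computing the statistics on rescaled data or rescaling the stored statistics would bring the two onto the same scale.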
Looks like there is no built-in support in Tokenizer for Chinese text parsing. It can be built using Jieba package, just need some coding work.
Multiple failures when running the tests, although Tokenizer definitely has a sequences_to_texts attribute.
Keras version : 2.2.0
    def test_sequences_to_texts():
        texts = [
            'The cat sat on the mat.',
            'The dog sat on the log.',
            'Dogs and cats living together.'
        ]
        tokenizer = keras.preprocessing.text.Tokenizer(num_words=10,
                                                       oov_token='<unk>')
        tokenizer.fit_on_texts(texts)
        tokenized_text = tokenizer.texts_to_sequences(texts)
>       trans_text = tokenizer.sequences_to_texts(tokenized_text)
E       AttributeError: 'Tokenizer' object has no attribute 'sequences_to_texts'
I'm working with images organized across several folders. I have a dataframe of their file paths, and up until now I've been using that with a script to move them into the necessary categorical folders. It takes up a lot of time and space. So, needless to say I was ecstatic when I found the flow_from_dataframe() method.
I have my valid dataframe and filepaths. I initialize the generators like this:
from keras.preprocessing.image import ImageDataGenerator
main_dir = '/User/name/etc/etc/224px/'
# Initiate the train and test generators with data augmentation
train_datagen = ImageDataGenerator(preprocessing_function = preprocess_input,
                                   #rescale = 1./255,
                                   horizontal_flip = True,
                                   fill_mode = "nearest",
                                   zoom_range = 0.3,
                                   width_shift_range = 0.1,
                                   height_shift_range = 0.1,
                                   rotation_range = 30)
test_datagen = ImageDataGenerator(preprocessing_function = preprocess_input)
seed = 108201987 # Optional random seed for shuffling and transformations.
train_generator = train_datagen.flow_from_dataframe(dataframe=train,
                                                    directory=main_dir,
                                                    x_col='filepath',
                                                    y_col='label',
                                                    has_ext=True,
                                                    target_size = (img_height, img_width),
                                                    batch_size = batch_size,
                                                    class_mode = "binary",
                                                    seed = seed)
validation_generator = test_datagen.flow_from_directory(dataframe=val,
                                                        directory=main_dir,
                                                        x_col='filepath',
                                                        y_col='label',
                                                        has_ext=True,
                                                        target_size = (img_height, img_width),
                                                        class_mode = "binary",
                                                        seed = seed)
Here's a sample file path: 'October 29 2018/Top view_XYZ_4-9/IMG_6854.JPG'
Both the training and validation generators have has_ext set to True, since my files have extensions. However, I get this error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-11-1e162ab6ee70> in <module>()
23 batch_size = batch_size,
24 class_mode = "binary",
---> 25 seed = seed)
26
27 validation_generator = test_datagen.flow_from_directory(dataframe=val,
/usr/local/anaconda3/lib/python3.5/site-packages/keras_preprocessing/image.py in flow_from_dataframe(self, dataframe, directory, x_col, y_col, has_ext, target_size, color_mode, classes, class_mode, batch_size, shuffle, seed, save_to_dir, save_prefix, save_format, subset, interpolation)
1105 save_format=save_format,
1106 subset=subset,
-> 1107 interpolation=interpolation)
1108
1109 def standardize(self, x):
/usr/local/anaconda3/lib/python3.5/site-packages/keras_preprocessing/image.py in __init__(self, dataframe, directory, image_data_generator, x_col, y_col, has_ext, target_size, color_mode, classes, class_mode, batch_size, shuffle, seed, data_format, save_to_dir, save_prefix, save_format, follow_links, subset, interpolation, dtype)
2101 break
2102 if not ext_exist:
-> 2103 raise ValueError('has_ext is set to True but'
2104 ' extension not found in x_col')
2105 temp_df = pd.DataFrame({x_col: filenames}, dtype=str)
ValueError: has_ext is set to True but extension not found in x_col
I was so excited about the possibility of never having to sort or mix-up my images again. Any ideas?
In the Tokenizer class, self.document_count is currently initialized with the document_count argument, but it should always be initialized with 0. Allowing the user to initialize it with a non-zero value will lead to incorrect results or errors in tf-idf mode. Furthermore, the document_count argument is not documented.
Hello,
I see an issue with TimeseriesGenerator.
tensorflow 1.11.0
Keras 2.2.2
Keras-Applications 1.0.6
Keras-Preprocessing 1.0.5
I am using the following code to test the TimeseriesGenerator:
data = np.arange(0,100).reshape(-1,1)
data_gen = TimeseriesGenerator(data, data, length=WINDOW_LENGTH,
                               sampling_rate=1, batch_size=1)
data_dim = 1
input1 = Input(shape=(WINDOW_LENGTH, data_dim))
lstm1 = LSTM(100)(input1)
hidden = Dense(20, activation='relu')(lstm1)
output = Dense(data_dim, activation='linear')(hidden)
model = Model(inputs=input1, outputs=output)
model.compile(loss='mse', optimizer='rmsprop', metrics=['accuracy'])
model.fit_generator(generator=data_gen,
                    steps_per_epoch=32,
                    epochs=10)
Here is the stack trace.
TypeErrorTraceback (most recent call last)
<ipython-input-55-ad7e35e8fffd> in <module>()
16 model.fit_generator(generator=data_gen,
17 steps_per_epoch=32,
---> 18 epochs=10)
/usr/lib/python2.7/site-packages/keras/legacy/interfaces.pyc in wrapper(*args, **kwargs)
/usr/lib/python2.7/site-packages/keras/engine/training.pyc in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
/usr/lib/python2.7/site-packages/keras/engine/training_generator.pyc in fit_generator(model, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
/usr/lib/python2.7/site-packages/keras/utils/data_utils.pyc in get(self)
/usr/lib/python2.7/site-packages/keras/utils/data_utils.pyc in _data_generator_task(self)
TypeError: TimeseriesGenerator object is not an iterator
I tried playing around with package versions and found that the issue occurs only with Keras-Preprocessing >= 1.0.3. I am able to run this code with 1.0.2.
I'm trying to use flow_from_dataframe with directory=None to use absolute paths, as described here
Keras==2.2.4
Keras-preprocessing==1.0.5
but this is what I get:
datagen.flow_from_dataframe(data, directory=None, x_col='fname', y_col='cat',has_ext=True)
...
TypeError Traceback (most recent call last)
<ipython-input-128-f0acbea298e4> in <module>()
----> 1 datagen.flow_from_dataframe(data, directory=None, batch_size=2, x_col='fname', y_col='cat',has_ext=True)
/usr/local/anaconda3/lib/python3.6/site-packages/keras_preprocessing/image.py in flow_from_dataframe(self, dataframe, directory, x_col, y_col, has_ext, target_size, color_mode, classes, class_mode, batch_size, shuffle, seed, save_to_dir, save_prefix, save_format, subset, interpolation)
1105 save_format=save_format,
1106 subset=subset,
-> 1107 interpolation=interpolation)
1108
1109 def standardize(self, x):
/usr/local/anaconda3/lib/python3.6/site-packages/keras_preprocessing/image.py in __init__(self, dataframe, directory, image_data_generator, x_col, y_col, has_ext, target_size, color_mode, classes, class_mode, batch_size, shuffle, seed, data_format, save_to_dir, save_prefix, save_format, follow_links, subset, interpolation, dtype)
2093 class_indices=self.class_indices,
2094 follow_links=follow_links,
-> 2095 df=True)
2096 if has_ext:
2097 ext_exist = False
/usr/local/anaconda3/lib/python3.6/site-packages/keras_preprocessing/image.py in _list_valid_filenames_in_directory(directory, white_list_formats, split, class_indices, follow_links, df)
1762 `["file1.jpg", "file2.jpg", ...]`).
1763 """
-> 1764 dirname = os.path.basename(directory)
1765 if split:
1766 num_files = len(list(
/usr/local/anaconda3/lib/python3.6/posixpath.py in basename(p)
144 def basename(p):
145 """Returns the final component of a pathname"""
--> 146 p = os.fspath(p)
147 sep = _get_sep(p)
148 i = p.rfind(sep) + 1
TypeError: expected str, bytes or os.PathLike object, not NoneType
There is something wrong with this link in the Image Preprocessing docs:
keras-preprocessing/keras_preprocessing/image.py
Lines 886 to 888 in 45fc4a0
When I hover on it, I see at the bottom of my browser this URL:
https://gist.github.com/fchollet/ 0830affa1f7f19fd47b06d4cf89ed44d
and when I click on it, it leads me to
https://gist.github.com/fchollet/%20%20%20%20%20%20%20%200830affa1f7f19fd47b06d4cf89ed44d
I can reproduce on Chrome and Firefox.
I don't see this support on latest keras-preprocessing source codes.
I have training data that are RGBA images stored as png files. When I read them in as RGB images using flow_from_directory everything runs smoothly.
But if I set the 'color_mode' argument of flow_from_directory to 'rgba', as in the documentation, I get the following error when trying to run fit_generator:
`Epoch 1/120
Traceback (most recent call last):
File "training_keras.py", line 326, in <module>
train_model(MODEL_NAME,BASE_DIR,OUTPUT_DIR,GPUS,NUM_EPOCHS,BATCH_SIZE,WIDTH,HEIGHT,MODEL_TYPE,WORKERS,DATA_FRACTION,TRAIN_ALL,FIRST_LAYER,FCN_SIZE,VALIDATION_DIR)
File "training_keras.py", line 314, in train_model
model.fit_generator( train_generator,steps_per_epoch=(train_generator.n/(BATCH_SIZE)/DATA_FRACTION),epochs=NUM_EPOCHS,callbacks=cbks,workers=WORKERS, validation_data=validate_generator)
File "/usr/local/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/keras/engine/training.py", line 1418, in fit_generator
initial_epoch=initial_epoch)
File "/usr/local/lib/python3.6/site-packages/keras/engine/training_generator.py", line 181, in fit_generator
generator_output = next(output_generator)
File "/usr/local/lib/python3.6/site-packages/keras/utils/data_utils.py", line 601, in get
six.reraise(*sys.exc_info())
File "/usr/local/lib/python3.6/site-packages/six.py", line 693, in reraise
raise value
File "/usr/local/lib/python3.6/site-packages/keras/utils/data_utils.py", line 595, in get
inputs = self.queue.get(block=True).get()
File "/usr/local/lib/python3.6/multiprocessing/pool.py", line 644, in get
raise self._value
File "/usr/local/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/usr/local/lib/python3.6/site-packages/keras/utils/data_utils.py", line 401, in get_index
return _SHARED_SEQUENCES[uid][i]
File "/usr/local/lib/python3.6/site-packages/keras_preprocessing/image.py", line 1441, in __getitem__
return self._get_batches_of_transformed_samples(index_array)
File "/usr/local/lib/python3.6/site-packages/keras_preprocessing/image.py", line 1932, in _get_batches_of_transformed_samples
batch_x[i] = x
ValueError: could not broadcast input array from shape (296,296,3) into shape (296,296,4)
`
Why does this happen?
Thanks!
P.S.
Keras is version 2.2.4
Keras-Preprocessing is version 1.0.6
Tensorflow-GPU is version 1.10.1
Could you please tell me the right installation procedure? Since the flow_from_dataframe feature is not available in the pip version, I tried to install it from the GitHub repo. This is what I tried, without success, in my virtual environment:
pip uninstall keras
pip uninstall keras-preprocessing
pip install git+https://github.com/keras-team/keras-preprocessing.git
pip install keras
What am I doing wrong here?
The method apply_affine_transform is a wrapper around scipy.ndimage.interpolation.affine_transform, which has a parameter order that sets the order of the spline interpolation. Not being able to pass this parameter causes a problem when generating random transformations of label images, since the interpolation produces non-integer values.
For example, given these labels (each integer is a class label):
[[2 2 0 2 2]
[1 3 2 3 1]
[2 1 0 1 2]
[3 1 0 2 0]
[3 1 3 2 1]]
the transformed result is
[[2.289865 1.7110896 1.8507836 2.172145 0.15832195]
[3. 2.2037435 1.0774351 1.4194988 2.3764393 ]
[3. 1.4194988 0.39426237 0.49894607 1.7452569 ]
[1.8929222 1.7380519 1.7849773 2. 1.2062019 ]
[1.2646285 2.8956027 2.2367117 1.3144331 0.23671168]]
which are clearly not valid class labels. I dug around and found that the reason is the hard-coded order in https://github.com/keras-team/keras-preprocessing/blob/master/keras_preprocessing/image.py line 323.
Being able to pass this parameter would enable correct data augmentation for image segmentation.
I have made a small patch to fix this issue, so I'll create a PR soon.
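To illustrate why the interpolation order matters for label maps, here is a tiny pure-Python sketch that resamples one row of labels at fractional positions. The resample function is a toy written for this example and is not scipy's implementation; in scipy, order=0 corresponds to nearest-neighbour sampling.

```python
def resample(row, positions, order):
    """Sample a 1-D row at fractional positions.

    order=0 picks the nearest neighbour; order=1 interpolates linearly.
    Toy helper for illustration only.
    """
    out = []
    for p in positions:
        if order == 0:
            out.append(row[int(round(p))])
        else:
            lo = int(p)
            hi = min(lo + 1, len(row) - 1)
            frac = p - lo
            out.append(row[lo] * (1 - frac) + row[hi] * frac)
    return out


labels = [2, 2, 0, 2, 2]
positions = [0.0, 0.75, 1.5, 2.25, 3.0]

print(resample(labels, positions, order=1))  # blends neighbouring labels
print(resample(labels, positions, order=0))  # only values already in `labels`
```

Nearest-neighbour sampling can only return values that already exist in the input, which is why it is the usual choice for segmentation masks.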
Hi, while implementing object detection in Keras with data augmentation, I have been checking the results of affine_transform and got strange results. For instance, for tx=0 and ty=24 I got a horizontal displacement to the left.
You can check the data augmentation code in
https://github.com/RParedesPalacios/GILA/blob/development/src/detect_generators.py
line 68
Has anybody else checked this??
Thanks.
From: keras-team/keras#10768 by @hadaev8
Tokenizer will fit/transform the string into chars if a single string is passed to the fit_on_texts/texts_to_sequences methods, regardless of the char_level setting. This happens because the methods expect a list of strings, so a single string is split into chars in this line for fitting:
and this one for transforming:
Reproducible code illustrating the problem with fit_on_texts:
from keras.preprocessing.text import Tokenizer
text='check check fail'
tokenizer = Tokenizer()
tokenizer.fit_on_texts(text)
tokenizer.word_index
Output:
{'c': 1, 'h': 2, 'e': 3, 'k': 4, 'f': 5, 'a': 6, 'i': 7, 'l': 8}
Wrapping the text in a list solves the issue:
tokenizer.fit_on_texts([text])
tokenizer.word_index
{'check': 1, 'fail': 2}
I would recommend checking that texts is a list of strings and, if it is not, either emitting a warning and wrapping it in a list or raising an error.
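A minimal sketch of the suggested check, written without Keras. Both as_text_list and word_index are hypothetical helpers; word_index is only a toy stand-in for what fit_on_texts builds.

```python
import warnings


def as_text_list(texts):
    """Hypothetical helper: wrap a bare string in a list and warn about it."""
    if isinstance(texts, str):
        warnings.warn("Expected a list of strings but got a single string; "
                      "wrapping it in a list.")
        return [texts]
    return list(texts)


def word_index(texts):
    """Toy frequency-ranked word index (most frequent word gets index 1)."""
    counts = {}
    for text in as_text_list(texts):
        for word in text.lower().split():
            counts[word] = counts.get(word, 0) + 1
    ranked = sorted(counts, key=lambda w: -counts[w])
    return {w: i + 1 for i, w in enumerate(ranked)}


print(word_index('check check fail'))    # bare string, wrapped with a warning
print(word_index(['check check fail']))  # already a list
```

Raising a TypeError instead of warning would be the stricter alternative; which one is preferable depends on how much existing code passes bare strings.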
I found that I need to rewrite load_img in image.py with OpenCV for my corner case of 16-bit images. Benchmarks also show that python-opencv is faster than PIL. Would there be interest in incorporating an opencv/cv2-based reading function into this package as an alternative to PIL?
Hello!
I am training an image classification model with multiple outputs:
trained_model = tf.keras.applications.xception.Xception(
    include_top=False,
    weights='imagenet',
    input_shape=[300, 300, 3],
    pooling='max')
outputs = []
for i in range(8):
    outputs.append(tf.keras.layers.Dense(1, activation='softmax', kernel_initializer=kernel_initializer)(trained_model.output))
model = tf.keras.Model(inputs=trained_model.input, outputs=outputs)
The y returned by this model is a Python list with 8 elements. Each element is a mini-batch of tensors.
However, flow_from_dataframe reads all my y columns from the dataframe as one numpy array, instead of a Python list.
Suppose my dataframe is something like this:
image_path,field_1,field_2,field_3,field_4,field_5,field_6,field_7,field_8
1532672467738.jpeg,1,1,0,1,0,0,0,1
1532669990747.jpeg,0,0,0,1,0,1,1,0
...
Then I call flow_from_dataframe:
train_batches = generator.flow_from_dataframe(
    dataframe=dataframe,
    directory=path,
    x_col='image_path',
    y_col=['field_1', 'field_2', 'field_3', 'field_4', 'field_5', 'field_6', 'field_7', 'field_8'],
    class_mode='other',
    batch_size=16
)
When I call fit_generator with both the model and train_batches, I get this error:
ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 8 array(s), but instead got the following list of 1 arrays: [array([[0, 0, 0, 0, 1, 0, 1, 1],
[1, 1, 0, 1, 1, 0, 0, 0],
[0, 0, 1, 0, 1, 0, 0, 1],
[0, 0, 1, 1, 0, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 0, 1, 0, 1, 0, 0],
So, as I wrote at the beginning: DataframeIterator sends a numpy array of shape (16, 8), while the model outputs a Python list of 8 numpy arrays of shape (16,).
I think the problem is in this excerpt from keras_preprocessing/image.py:
if self.class_mode == 'input':
    batch_y = batch_x.copy()
elif self.class_mode == 'sparse':
    batch_y = self.classes[index_array]
elif self.class_mode == 'binary':
    batch_y = self.classes[index_array].astype(self.dtype)
elif self.class_mode == 'categorical':
    batch_y = np.zeros(
        (len(batch_x), self.num_classes),
        dtype=self.dtype)
    for i, label in enumerate(self.classes[index_array]):
        batch_y[i, label] = 1.
elif self.class_mode == 'other':
    batch_y = self.data[index_array]
else:
    return batch_x
return batch_x, batch_y
The line batch_y = self.data[index_array] returns a NumPy array.
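A possible workaround on the user side is to split the (batch, n_outputs) targets into one sequence per output before they reach the model. The sketch below shows the reshaping on plain nested lists; split_targets is a made-up helper, and the three rows are taken from the error message above.

```python
def split_targets(batch_y):
    """Turn a (batch, n_outputs) nested list into n_outputs lists of length batch."""
    return [list(column) for column in zip(*batch_y)]


batch_y = [
    [0, 0, 0, 0, 1, 0, 1, 1],
    [1, 1, 0, 1, 1, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0, 1],
]
per_output = split_targets(batch_y)
print(len(per_output), len(per_output[0]))
```

With a NumPy array the same split can be written as [batch_y[:, i] for i in range(batch_y.shape[1])], for example inside a small generator that wraps the iterator and yields (batch_x, list_of_targets).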
from keras.preprocessing.text import Tokenizer
texts = ['a b c']
tokenizer = Tokenizer(num_words=2)
tokenizer.fit_on_texts(texts)
tokenizer.word_index
{'a': 1, 'b': 2, 'c': 3}
print(tokenizer.texts_to_sequences(texts))
[[1,]]
From the perspective of someone who only wants to use the sequence.pad_sequences module and doesn't want to make a shallow copy of it, I believe that a better approach for this project would be to avoid any dependencies, or to depend only on numpy.
We could do what many packages do for matplotlib, for example: display a warning to the user that an optional dependency is required for that operation.
Packages to remove from dependencies:
The version recorded in the setup.py file for the 1.0.2 tag is 1.0.1, which is incorrect. The sdist on PyPI has the correct version in the setup.py file.
This can cause issues if a checkout of the git tag is used as the means to install keras-preprocessing, as the incorrect version will be recorded in the metadata.
I do not think anything should be done to fix this as changing a tag is a bad procedure. I think it is useful to have this in the issue tracker in case anyone else runs into the issue.
My model is trained on images in float representation, so their maximum value is 1.0.
However, when I apply random_brightness, the image values lie between 0.0 and 255.0.
I think this is not expected, or at least it should be noted in the documentation, shouldn't it?
I'm trying to augment both images and masks. Images work properly, but masks fail.
It happens when the mask is zoomed out and rotated. The black and white areas should be blue, like the background.
I've tried changing fill_mode, but constant and nearest don't work. Wrap works, but it creates red areas where it shouldn't.
Code:
def augGenerator():
    gen = ImageDataGenerator(
        rotation_range=20,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
    )
    return gen

def augmentImage(img, mask, img_size, aug_count):
    aug_images = [img]
    aug_masks = [mask]
    img = img.reshape(-1, img_size, img_size, 3)
    mask = mask.reshape(-1, img_size, img_size, 3)
    gen_img = augGenerator()
    gen_mask = augGenerator()
    seed = 1
    gen_img.fit(img, augment=True, seed=seed)
    gen_mask.fit(mask, augment=True, seed=seed)
    img_aug_iter = gen_img.flow(img,seed=seed)
    mask_aug_iter = gen_mask.flow(mask,seed=seed)
    aug_images += [next(img_aug_iter)[0] for i in range(aug_count)]
    aug_masks += [next(mask_aug_iter)[0] for i in range(aug_count)]
    return aug_images, aug_masks
Hello again! I'm still struggling with flow_from_dataframe() after the issues I had here.
In order to use the new fixes, I cloned the keras repo, and then replaced the contents of the preprocessing folder with the latest from the keras-preprocessing repo. I renamed the local repo keras2 to avoid importing the vanilla repo. The code finally runs, but it's not finding any images.
Here's my script:
import pandas as pd
import numpy as np
import sys
sys.path.append('/Users/lmcane/documents/tools/keras2/')
from keras2.preprocessing.image import ImageDataGenerator
train = pd.read_csv('short_dir_train.csv', index_col=0)
print(train.filepath[0] + '\n')
train.info()
Returns:
Using TensorFlow backend.
March 29 2018/Top view_1-2/IMG_6823.JPG
<class 'pandas.core.frame.DataFrame'>
Int64Index: 869 entries, 0 to 868
Data columns (total 2 columns):
filepath 869 non-null object
label 869 non-null object
dtypes: object(2)
memory usage: 60.4+ KB
Then the main body of the script:
main_dir = '/Users/lmcane/Documents/Datasets/Unsorted Extracted/224x224px'
img_width, img_height = 224, 224
nb_train_samples = 433
nb_validation_samples = 216
batch_size = 20
epochs = 10
train_datagen = ImageDataGenerator(horizontal_flip = True,
                                   fill_mode = "nearest",
                                   zoom_range = 0.3,
                                   width_shift_range = 0.1,
                                   height_shift_range = 0.1,
                                   rotation_range = 30)
train_generator = train_datagen.flow_from_dataframe(dataframe=train,
                                                    directory=main_dir,
                                                    x_col='filepath',
                                                    y_col='label',
                                                    has_ext=True,
                                                    target_size = (img_height, img_width),
                                                    batch_size = batch_size,
                                                    class_mode = "binary")
Returns:
Found 0 images belonging to 2 classes.
It should find 433. I suspect I didn't import the repo correctly?
Hello! While running random_channel_shift from keras-preprocessing (master) image.py, I noticed that it behaves differently from the channel shift I expected.
I think the expected channel shift behaviour is that of the old implementation.
from keras.datasets import cifar10
from keras.preprocessing.image import random_channel_shift
import numpy as np
import matplotlib.pyplot as plt
def plot_tiles(images, rows=5, columns=5):
    pos = 1
    for idx in range(rows*columns):
        plt.subplot(rows, columns, pos)
        img = images[idx]
        plt.imshow(img)
        plt.axis("off")
        pos += 1
    plt.show()
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
sample_images = x_train[:9]/255
channel_shift_range = 0.3
plot_tiles(sample_images, rows=3, columns=3)
channel_shift_images_latest = []
for _ in sample_images:
    channel_shift_images_latest.append(_random_channel_shift(_, channel_shift_range, 2))
channel_shift_images_latest = np.array(channel_shift_images_latest)
plot_tiles(channel_shift_images_latest, rows=3, columns=3)
keras-preprocessing(master)/image.py/random_channel_shift
channel_shift_images_old = []
for _ in sample_images:
    channel_shift_images_old.append(random_channel_shift(_, channel_shift_range, 2))
channel_shift_images_old = np.array(channel_shift_images_old)
plot_tiles(channel_shift_images_old, rows=3, columns=3)
I am trying to use the flow_from_dataframe() function but run into a KeyError: nan error.
My csv file is as follows:
subDirectory_filePath, expression
img_1, 0
img_2, 3
...
img_n, 0
Therefore the first column is a string while the other one is an integer. I have tried to follow this tutorial:
Tutorial on Keras ImageDataGenerator with flow_from_dataframe
and thus my code is:
train_datagen = ImageDataGenerator(rescale=1. / 255,horizontal_flip=False)
df_train = pd.read_csv(data['csv_train_file'], dtype={'subDirectory_filePath': str, 'expression': int})
train_generator = train_datagen.flow_from_dataframe(
    dataframe=df_train,
    directory=data['img_dir'],
    x_col='subDirectory_filePath',
    y_col='expression',
    has_ext=True,
    class_mode="categorical",
    target_size=(model_params['img_height'], model_params['img_width']),
    batch_size=model_params['batch_size']
    #save_to_dir='test_train'
)
and get the following error:
Found 427298 images belonging to 11 classes.
Traceback (most recent call last):
File "train_model.py", line 250, in <module>
train_model(model_name=model_name, dataset=dataset, mode=mode, weights=weights, computer=computer, run=run)
File "train_model.py", line 198, in train_model
train_generator, validation_generator = get_csv_generator(data, model_params, da, extended=False)
File "train_model.py", line 121, in get_csv_generator
batch_size=model_params['batch_size']
File "/home/michael/.local/lib/python3.5/site-packages/keras_preprocessing/image.py", line 1108, in flow_from_dataframe
interpolation=interpolation)
File "/home/michael/.local/lib/python3.5/site-packages/keras_preprocessing/image.py", line 2168, in __init__
self.classes = np.array([self.class_indices[cls] for cls in classes])
File "/home/michael/.local/lib/python3.5/site-packages/keras_preprocessing/image.py", line 2168, in <listcomp>
self.classes = np.array([self.class_indices[cls] for cls in classes])
KeyError: nan
Furthermore, while printing my csv file I can see that it is full of NaN such as:
subDirectory_filePath expression
002276d73d5822544f39d86b45098e67f84f78cd8edcba8... NaN
01807ce4c37cc4463bd06a966a4043edc14864a0075ff78... NaN
I am guessing that my issue has something to do with loading my csv file. I first tried without the dtype dictionary, but the same error occurs.
Any help is much appreciated.
If you inspect the indices of the target and test data provided with each iteration of TimeseriesGenerator, you find that the target data comes from time step i, while the test data comes from time steps i-length to i-1, inclusive. There appears to be no way to adjust this offset.
The line
should be changed to
targets[j] = self.targets[indices[-1]]
or something similar.
Here's some sample code that displays the issue.
from __future__ import print_function
from keras.preprocessing.sequence import TimeseriesGenerator
import numpy
target = numpy.zeros((100,4,4), dtype = numpy.float32)
for i in range(0,100):
    target[i,...] = i
test = 0 + target
sequence = TimeseriesGenerator(test, target, length = 5, sampling_rate = 1,
stride = 1, start_index = 0, end_index = None,
shuffle = False, batch_size = 32)
epochs = len(sequence)
print('Length of sequence is', epochs)
epoch = 1
for block in sequence:
print('Epoch', epoch)
print(' test data')
print(' shape =', block[0].shape)
print(' elements =', block[0][:,:,2,2])
print(' target data')
print(' shape =', block[1].shape)
print(' elements =', block[1][:,2,2])
epoch += 1
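The offset described in this report can be illustrated without Keras. The sketch below mirrors the indexing as the report describes it (samples from i-length to i-1, target at i). It is not a copy of the library code, and `window_indices` and `shift_targets` are hypothetical names. The second helper shows one possible workaround under that assumption: delay the targets by one step before passing them in, so the target lines up with the last sample of each window.

```python
def window_indices(row, length, sampling_rate=1):
    """Sample indices and target index for the window that ends just before `row`."""
    sample_indices = list(range(row - length, row, sampling_rate))
    target_index = row
    return sample_indices, target_index


def shift_targets(targets):
    """Delay targets by one step: position i holds the value originally at i - 1.

    Position 0 is repeated as padding, since it has no predecessor.
    """
    return [targets[0]] + list(targets[:-1])


samples, target = window_indices(row=5, length=5)
print(samples, target)  # [0, 1, 2, 3, 4] 5

targets = list(range(10))
shifted = shift_targets(targets)
# With shifted targets, the value read at the target index is the one that
# belongs to the last sample of the window.
print(shifted[target] == targets[samples[-1]])  # True
```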
Hi, I'm a maintainer for https://aur.archlinux.org/pkgbase/python-keras-preprocessing/
The latest pip release is 1.0.3 whereas the latest github release is 1.0.2 (albeit with a wrong version number in the setup.py)
Are the pip releases the preferred official release, or should I stick to using the github releases?
Seems like the num_words property in text.py is not initialized with the correct length. I found this out because I'm using this value in order to calculate the number of input/output neurons which leads to issues when I'm training the model.
I think num_words should be initialized like this: num_words = len(self.word_index) if not set explicitly.
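Until the default changes, the layer size can be derived from `word_index` directly. This is a minimal sketch with a hypothetical helper, `effective_vocab_size`. It assumes, as is the case for the Keras `Tokenizer`, that word indices start at 1 and index 0 is reserved, which is why one is added to the length.

```python
def effective_vocab_size(word_index, num_words=None):
    """Number of distinct integer ids a layer must handle, including reserved id 0."""
    full_size = len(word_index) + 1  # +1 for the reserved index 0
    if num_words is None:
        return full_size
    # With num_words set, only ids below num_words are expected to be emitted.
    return min(num_words, full_size)


# Made-up vocabulary for illustration.
word_index = {"the": 1, "cat": 2, "sat": 3}
print(effective_vocab_size(word_index))               # 4
print(effective_vocab_size(word_index, num_words=2))  # 2
```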
Hi! When I ran my code, a memory error occurred in the fit function. I changed the type of img1 to float32 so that x would not be copied in x = np.asarray(x, dtype = backend.floatx()), which is shown in the picture below. Although there are many ways to solve this problem, I am curious whether x = np.copy(x) is needed. It seems that an if...else... statement that decides whether to adjust the order of x could avoid unnecessary memory allocation, especially when x is a huge matrix.
Many thanks!
The following is the code from lines 1205 to 1232 of image.py.
x = np.asarray(x, dtype=backend.floatx())
if x.ndim != 4:
    raise ValueError('Input to `.fit()` should have rank 4. '
                     'Got array with shape: ' + str(x.shape))
if x.shape[self.channel_axis] not in {1, 3, 4}:
    warnings.warn(
        'Expected input to be images (as Numpy array) '
        'following the data format convention "' +
        self.data_format + '" (channels on axis ' +
        str(self.channel_axis) + '), i.e. expected '
        'either 1, 3 or 4 channels on axis ' +
        str(self.channel_axis) + '. '
        'However, it was passed an array with shape ' +
        str(x.shape) + ' (' + str(x.shape[self.channel_axis]) +
        ' channels).')
if seed is not None:
    np.random.seed(seed)
x = np.copy(x)
if augment:
    ax = np.zeros(
        tuple([rounds * x.shape[0]] + list(x.shape)[1:]),
        dtype=backend.floatx())
    for r in range(rounds):
        for i in range(x.shape[0]):
            ax[i + r * x.shape[0]] = self.random_transform(x[i])
    x = ax
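The conditional copy suggested in this report could look roughly like the sketch below. It relies on two NumPy facts: `np.asarray` returns the input unchanged when it is already an array of the requested dtype, and `np.shares_memory` reports whether two arrays overlap. The helper name `independent_float_array` is hypothetical and this is not the library's code. Note that the result is still one full copy of the data, so it only removes the second allocation, not the first.

```python
import numpy as np


def independent_float_array(x, dtype=np.float32):
    """Return an array of `dtype` that does not share memory with `x`,
    making at most one copy."""
    converted = np.asarray(x, dtype=dtype)
    # asarray already copied if a dtype conversion was needed; copy only otherwise.
    if isinstance(x, np.ndarray) and np.shares_memory(converted, x):
        converted = converted.copy()
    return converted


images32 = np.zeros((4, 8, 8, 3), dtype=np.float32)
images64 = np.zeros((4, 8, 8, 3), dtype=np.float64)

# asarray alone would hand back the very same float32 array ...
print(np.asarray(images32, dtype=np.float32) is images32)  # True
# ... so the helper copies in that case, and skips the extra copy for float64.
print(np.shares_memory(independent_float_array(images32), images32))  # False
print(np.shares_memory(independent_float_array(images64), images64))  # False
```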
When the image directory has more files than specified in the 'x_col' of the dataframe, the generator generates more images than expected. See the repro.
It might be that I don't understand how it works though :)
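The behaviour the report expects can be stated as "only files named in the dataframe column are used, regardless of what else is in the directory". The sketch below illustrates that expectation with the standard library only. `listed_files_only` is a hypothetical helper and the file names are made up; it does not show how the generator is implemented.

```python
import os
import tempfile


def listed_files_only(directory, listed_names):
    """Return the files in `directory` that also appear in `listed_names`."""
    wanted = set(listed_names)
    return sorted(name for name in os.listdir(directory) if name in wanted)


with tempfile.TemporaryDirectory() as tmp:
    # The directory holds one more file than the listing mentions.
    for name in ("a.png", "b.png", "extra.png"):
        open(os.path.join(tmp, name), "w").close()
    result = listed_files_only(tmp, ["a.png", "b.png"])

print(result)  # ['a.png', 'b.png']
```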
In order to test how my model training script performed on a benchmark dataset, I converted the stored MNIST to a set of png images. I have them organized in two ways:
Method 1. I have a "train" folder and a "test" folder where images are stored without further organization. I have a dataframe for the train and test set, with column 1 listing the absolute directory, and column 2 listing the label. I've carefully examined this csv- the labels and image listings appear accurate. I've captured and tested the output of flow_from_dataframe(), and it looks fine.
Sample of csv:
,test_samples,test_labels
0,/path/Data/test/6992.png,7
1,/path/Data/test/1380.png,9
2,/path/Data/test/5817.png,4
3,/path/Data/test/5295.png,5
4,/path/Data/test/5340.png,2
Method 2. I have a train and test folder, each with subdirectories for the different categories of images.
Other than how they're organized, these datasets are otherwise identical.
If I run my script and use flow_from_dataframe() with the assets from Method 1, the highest validation accuracy I can manage ranges from 0.01-0.05. If I run my script with flow_from_directory() using the assets in Method 2, my highest checkpoint is 0.93.
What could be the source of this disparity? Am I misusing flow_from_dataframe()? I'll share my scripts from each approach below. Thanks in advance for any insight.
Method 1: Garbage Validation Accuracy
import pandas as pd
import numpy as np
import keras
from keras_preprocessing.image import ImageDataGenerator
from keras import applications
from keras import optimizers
from keras.models import Model
from keras.layers import Dropout, Flatten, Dense, GlobalAveragePooling2D
from keras import backend as k
from keras.callbacks import ModelCheckpoint, CSVLogger
from keras.applications.vgg16 import VGG16, preprocess_input
# INITIALIZE MODEL
img_width, img_height = 32, 32
model = VGG16(weights = 'imagenet', include_top=False, input_shape = (img_width, img_height, 3))
# freeze all layers
for layer in model.layers:
    layer.trainable = False
# Adding custom Layers
x = model.output
x = Flatten()(x)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
predictions = Dense(10, activation="softmax")(x)
# creating the final model
model_final = Model(input = model.input, output = predictions)
# compile the model
rms = optimizers.RMSprop(lr=1e-4)
model_final.compile(loss = "categorical_crossentropy", optimizer = rms, metrics=["accuracy"])
# LOAD AND DEFINE SOURCE DATA
#df.column_name = df.column_name.astype(str)
train = pd.read_csv('/path/Data/MNIST_train.csv', index_col=0)
train.train_labels = train.train_labels.astype(str)
val = pd.read_csv('/path/Data/MNIST_test.csv', index_col=0)
val.test_labels = val.test_labels.astype(str)
nb_train_samples = 60000
nb_validation_samples = 10000
batch_size = 60
epochs = 5
# Initiate the train and test generators
train_datagen = ImageDataGenerator()
test_datagen = ImageDataGenerator()
train_generator = train_datagen.flow_from_dataframe(
    dataframe=train,
    directory=None,
    x_col='train_samples',
    y_col='train_labels',
    has_ext=True,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    color_mode='rgb')
validation_generator = test_datagen.flow_from_dataframe(
    dataframe=val,
    directory=None,
    x_col='test_samples',
    y_col='test_labels',
    has_ext=True,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    color_mode='rgb')
# DEFINE CALLBACKS
path = '/path/chk/epoch_{epoch:02d}-valLoss_{val_loss:.2f}-valAcc_{val_acc:.2f}.hdf5'
chk = ModelCheckpoint(path, monitor = 'val_acc', verbose = 1, save_best_only = True, mode = 'max')
logger = CSVLogger('/path/chk/training_log.csv', separator = ',', append=False)
nPlus = 1
samples_per_epoch = nb_train_samples * nPlus
# Train the model
model_final.fit_generator(train_generator,
                          steps_per_epoch=int(samples_per_epoch / batch_size),
                          epochs=epochs,
                          validation_data=validation_generator,
                          validation_steps=int(nb_validation_samples / batch_size),
                          callbacks=[chk, logger])
METHOD 2: Stellar Validation Accuracy
from keras import applications
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Model
from keras.layers import Dropout, Flatten, Dense, GlobalAveragePooling2D
from keras import backend as k
from keras.callbacks import ModelCheckpoint, CSVLogger
img_width, img_height = 32, 32
train_data_dir = '/path/Data/categorical_subdirectories/test/'
validation_data_dir = '/path/Data/categorical_subdirectories/train/'
nb_train_samples = 60000
nb_validation_samples = 10000
batch_size = 60
epochs = 10
from keras.applications.vgg16 import VGG16, preprocess_input
model = VGG16(weights = "imagenet", include_top=False, input_shape = (img_width, img_height, 3))
# Freeze the layers which you don't want to train. Here all layers of the base model are frozen.
for layer in model.layers:
    layer.trainable = False
#Adding custom Layers
x = model.output
x = Flatten()(x)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
predictions = Dense(10, activation="softmax")(x)
# creating the final model
model_final = Model(input = model.input, output = predictions)
RMSprop = optimizers.RMSprop(lr=1e-4)
# compile the model
model_final.compile(loss = "categorical_crossentropy", optimizer = RMSprop, metrics=["accuracy"])
model_final.summary()
# Initiate the train and test generators with data augmentation
train_datagen = ImageDataGenerator(preprocessing_function = preprocess_input)
test_datagen = ImageDataGenerator(preprocessing_function = preprocess_input)
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode="categorical")
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    class_mode="categorical")
# Save the model according to the conditions
path = '/path/chk/epoch_{epoch:02d}-valLoss_{val_loss:.2f}-valAcc_{val_acc:.2f}.hdf5'
chk = ModelCheckpoint(path, monitor = 'val_acc', verbose = 1, save_best_only = True, mode = 'max')
logger = CSVLogger('/path/chk/training log.csv', separator = ',', append=False)
nPlus = 1
samples_per_epoch = nb_train_samples * nPlus
# Train the model
model_final.fit_generator(train_generator,
                          steps_per_epoch=int(samples_per_epoch / batch_size),
                          epochs=epochs,
                          validation_data=validation_generator,
                          validation_steps=int(nb_validation_samples / batch_size),
                          callbacks=[chk, logger])
I am trying to use the preprocessing function to take a network sized crop out of inconsistently sized input images instead of resizing to the network size. I have tried to do this using the preprocessing function but found that it is not easily possible. Using Keras 2.2.2
C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\keras_preprocessing\image.py in __init__(self, directory, image_data_generator, target_size, color_mode, classes, class_mode, batch_size, shuffle, seed, data_format, save_to_dir, save_prefix, save_format, follow_links, subset, interpolation)
   1665         self.directory = directory
   1666         self.image_data_generator = image_data_generator
-> 1667         self.target_size = tuple(target_size)
   1668         if color_mode not in {'rgb', 'rgba', 'grayscale'}:
   1669             raise ValueError('Invalid color mode:', color_mode,
TypeError: 'NoneType' object is not iterable
def _get_batches_of_transformed_samples(self, index_array):
    batch_x = np.zeros(
        (len(index_array),) + self.image_shape,
        dtype=backend.floatx())
    # build batch of image data
    for i, j in enumerate(index_array):
        fname = self.filenames[j]
        img = load_img(os.path.join(self.directory, fname),
                       color_mode=self.color_mode,
                       target_size=None)
        x = img_to_array(img, data_format=self.data_format)
        # Pillow images should be closed after `load_img`,
        # but not PIL images.
        if hasattr(img, 'close'):
            img.close()
        params = self.image_data_generator.get_random_transform(x.shape)
        x = self.image_data_generator.apply_transform(x, params)
        x = self.image_data_generator.standardize(x)
        width_height_tuple = (self.target_size[1], self.target_size[0])
        if (x.shape[1], x.shape[0]) != width_height_tuple:
            x = cv2.resize(x, width_height_tuple, interpolation=cv2.INTER_AREA)
        batch_x[i] = x
While looking into this I saw that the preprocessing function runs at the start of standardize, which is after the random transforms are applied. To me this makes "preprocessing" a misleading name, since it isn't actually happening first.
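For reference, the crop itself comes down to choosing a random window that fits inside each image. The sketch below computes such a window with the standard library only. `random_crop_box` is a hypothetical helper, not an existing Keras function, and the image and crop sizes are made up. The returned bounds could be used to slice an array as `x[top:bottom, left:right]`.

```python
import random


def random_crop_box(height, width, crop_height, crop_width, rng=random):
    """Pick a random (top, left, bottom, right) window of the requested size."""
    if crop_height > height or crop_width > width:
        raise ValueError('Crop %dx%d does not fit in image %dx%d'
                         % (crop_height, crop_width, height, width))
    # randint is inclusive on both ends, so the window always stays inside.
    top = rng.randint(0, height - crop_height)
    left = rng.randint(0, width - crop_width)
    return top, left, top + crop_height, left + crop_width


rng = random.Random(0)  # seeded only to make the example repeatable
top, left, bottom, right = random_crop_box(480, 640, 224, 224, rng)
print(bottom - top, right - left)  # 224 224
```

Images smaller than the crop raise a `ValueError` here. A real pipeline would need to decide whether to pad, resize, or skip such images.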
I'm working on the MURA dataset by Stanford. I'm trying to load the dataset using Keras's ImageDataGenerator. The data is in the following hierarchy:
The study1_positive folder contains the images.
ImageDataGenerator.flow_from_directory cannot be used with this folder structure, therefore I tried using the flow_from_dataframe method.
However, when run, the code keeps on executing and doesn't stop.
Following is the format of the Pandas DataFrame that I'm passing to the flow_from_dataframe method:
I've also tried changing the labels to 'abnormal' and 'normal' in place of 1 and 0, respectively.
Below is the code:
train_imggen = ImageDataGenerator(rescale=1./255, rotation_range=30,
                                  horizontal_flip=True)
train_loader = train_imggen.flow_from_dataframe(traindf, './', shuffle=True,
                                                x_col='path', y_col='label',
                                                color_mode='grayscale',
                                                target_size=(320, 320),
                                                class_mode='binary',
                                                batch_size=8)
apply_transform changes the number of channels from input to output.
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
image_datagen_args = {
    'shear_range': 0.2,
    'zoom_range': 0.2,
    'width_shift_range': 0.2,
    'height_shift_range': 0.2,
    'rotation_range': 45,
    'horizontal_flip': True,
    'vertical_flip': True
}
image_datagen = ImageDataGenerator(**image_datagen_args)
x = np.zeros((32, 32, 1))
params = image_datagen.get_random_transform(x.shape)
x = image_datagen.apply_transform(x, params)
x.shape == (32, 32, 3)
In https://github.com/keras-team/keras-preprocessing/blob/master/setup.py
extras_require={
    'tests': ['pytest',
              'pytest-pep8',
              'pytest-xdist',
              'pytest-cov'],
    'image': ['scipy>=0.14'],
},
Shouldn't Pillow also be declared?
I just did a fresh virtual env install and after pip install -r requirements.txt (which doesn't contain Pillow because I assume you take care of that) I noticed that my code fails because Pillow isn't installed.
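What the report asks for could be as small as one extra entry in the `image` extra. The fragment below is only a sketch of that change, written as a standalone dict so it can be run on its own. No version constraint is given for Pillow because the report does not state one, and whether Pillow belongs in an extra or in the core requirements is a decision for the maintainers.

```python
# Sketch of the extras from setup.py with Pillow added to the 'image' extra.
# The Pillow entry is the proposed addition; everything else is unchanged.
extras_require = {
    'tests': ['pytest',
              'pytest-pep8',
              'pytest-xdist',
              'pytest-cov'],
    'image': ['scipy>=0.14',
              'Pillow'],
}

print(extras_require['image'])  # ['scipy>=0.14', 'Pillow']
```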
The full description can be found at:
keras-team/keras#11452
Referencing this: keras-team/keras#10869 (comment)
I've just been able to reproduce this bug using the following script.
from keras.datasets import mnist
from keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import numpy as np
(X, y), _ = mnist.load_data()
X = X.reshape(X.shape[0], 1, 28, 28)
X = X[:100]
datagen = ImageDataGenerator(width_shift_range=0.2)
datagen.fit(X)
imgs = []
batches = 0
for i in datagen.flow(X, batch_size=32):
    batches += 1
    for x in i:
        img = np.asarray(x).reshape((28, 28))
        imgs.append([plt.imshow(img, cmap='gray', animated=True)])
    print("Completed batch : %i" % batches)
    if batches >= len(X) / 32:
        break
fig = plt.figure()
ani = animation.ArtistAnimation(fig, imgs, interval=50, blit=True,
                                repeat_delay=1000)
ani.save("test.gif", writer="imagemagick")
Which produces the following (blank) gif:
(Apologies about the scripting, did it whilst stood up waiting for a meeting!)
If any contributor is feeling up for it, it would be good to add unit tests that check that appropriate error messages are getting raised in various situations not yet covered. It seems that a few error messages were previously incorrectly formatted, because we don't have unit tests for some of these exceptions.
We have some such unit tests already, which look like this:
with pytest.raises(ValueError) as e_info:
    generator.flow((images, x_misc_err), np.arange(dsize), batch_size=3)
assert 'All of the arrays in' in str(e_info.value)
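The same pattern, raise and then check the message text, can be tried out without the project's fixtures. The sketch below uses only the standard library's `unittest` so that it runs anywhere. `check_same_length` is a hypothetical validator with a made-up message, standing in for whichever generator check is under test.

```python
import unittest


def check_same_length(arrays):
    """Hypothetical validator: all arrays must have the same length."""
    lengths = sorted({len(a) for a in arrays})
    if len(lengths) > 1:
        raise ValueError('All of the arrays in `x` should have the same '
                         'length. Found lengths: %s' % lengths)


class ErrorMessageTest(unittest.TestCase):
    def test_mismatched_lengths_message(self):
        # Fails if no error is raised OR if the message does not match.
        with self.assertRaisesRegex(ValueError, 'All of the arrays in'):
            check_same_length([[1, 2, 3], [1, 2]])

    def test_matching_lengths_pass(self):
        check_same_length([[1, 2], [3, 4]])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ErrorMessageTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Checking the message as well as the exception type is what catches incorrectly formatted messages. A test that only checks for `ValueError` would still pass with a broken format string, as long as the error type is unchanged.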
Currently keras and keras-preprocessing depend on each other.
Since it's already a separate module with its own pip package, shouldn't we be able to use keras-preprocessing as an independent tool that does not depend on keras?
Having these two modules mutually depend on each other is causing conflicts at keras-mxnet (a keras fork): awslabs/keras-apache-mxnet#129
I am not sure if this is a bug or intended functionality, but I realized that ImageDataGenerator's standardize() method is actually modifying the input passed to it. This can be quite annoying in the context of a Jupyter notebook, when you want to reuse the raw images again. Consider the following example code:
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
datagen = ImageDataGenerator(rescale=2)
images = np.ones((10, 128, 128, 3))
print(images.mean()) # 1.0
images_std = datagen.standardize(images)
print(images.mean(), images_std.mean()) # 2.0 2.0
I tracked this down to the fact, that this function uses the shorthand operator for simple arithmetic operations: https://github.com/keras-team/keras-preprocessing/blob/master/keras_preprocessing/image.py#L1124 The following two functions do exactly the same thing, however the first one does not modify the input while the second does:
def multiply1(x):
    x = x * 2
def multiply2(x):
    x *= 2
images = np.ones((10, 128, 128, 3))
print(images.mean()) # 1.0
multiply1(images)
print(images.mean()) # 1.0
multiply2(images)
print(images.mean()) # 2.0
Note that the behavior also differs for different standardization steps, e.g. zca_whitening does not modify the input while rescale does.
I learned from the manual page of flow_from_directory that the first argument passed to flow_from_directory is a directory. Sometimes it would also be convenient to pass the paths of the images directly, when the images are placed in multiple directories. It would help if flow_from_directory could accept images in the following format:
/path1/img1.jpg cat
/path2/img2.jpg dog
The first column is the absolute path to the image, and the second column is the class name.
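Reading such a listing is straightforward. The sketch below parses the two-column format shown above into parallel lists of paths and labels. `parse_listing` is a hypothetical helper, not an existing Keras function, and what a generator would then do with the two lists is left open.

```python
def parse_listing(text):
    """Split 'path label' lines into two parallel lists.

    The label is taken as the last whitespace-separated token, so paths
    that contain spaces are still handled.
    """
    paths, labels = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        path, label = line.rsplit(None, 1)
        paths.append(path)
        labels.append(label)
    return paths, labels


listing = "/path1/img1.jpg cat\n/path2/img2.jpg dog\n"
paths, labels = parse_listing(listing)
print(paths)   # ['/path1/img1.jpg', '/path2/img2.jpg']
print(labels)  # ['cat', 'dog']
```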
Modify the ImageDataGenerator class to receive an extra boolean target_size argument on its constructor and update its methods to produce random crops during training.
See Keras API Design Review at https://docs.google.com/document/d/1zdSsPCxbrCedQgOYqc-Ne6gWzYqIqIqDgExHyThxl1o/edit?usp=sharing
See Keras Issue keras-team/keras#11237 for more details.
It seems the values, not the keys, should be converted to integers, as per the get_config method:
index_docs = {k: int(v) for k, v in index_docs.items()}
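Which side needs `int()` depends on how the dict was serialized, so it may help to look at what a plain JSON round trip does. The sketch below uses made-up counts and only the standard `json` module. It does not reproduce the library's `get_config`, so treat it as a way to check the assumption rather than as a verdict on the report.

```python
import json

# Made-up mapping: word index -> number of documents containing the word.
index_docs = {1: 3, 2: 1}

restored = json.loads(json.dumps(index_docs))
print(restored)  # {'1': 3, '2': 1}

# After a plain round trip the keys are strings and the values are still ints,
# so in this case it is the keys that need converting back.
keys_fixed = {int(k): v for k, v in restored.items()}
print(keys_fixed == index_docs)  # True
```

If the values were stored as strings before serialization, they would come back as strings too, and the value conversion shown in the report would then be the one that is needed.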
Dear Keras team,
I need to combine data from multiple directories with exactly same sub directory structure.
Would it be possible to add this feature to flow_from_directory?
With my restricted programming knowledge I would simply modify/add some lines before the line below.
keras-preprocessing/keras_preprocessing/image.py
Line 1888 in 2c7ef1d
This package suffers from the same P0 issue as keras-team/keras-applications#28
There is the further complication that this package leverages imports from Keras as soon as the image.py and sequence.py files are executed for the first time (due to subclassing of the Sequence class). This does not prevent us from applying either of the two solutions proposed, though (for the second solution, we would have to use multiple inheritance to get it to work).