
ranger-deep-learning-optimizer's Introduction

Ranger-Deep-Learning-Optimizer


Ranger - a synergistic optimizer combining RAdam (Rectified Adam) and LookAhead, and now GC (gradient centralization) in one optimizer.

quick note - Ranger21 is now in beta and is Ranger with a host of new improvements.

Recommend you compare results with Ranger21: https://github.com/lessw2020/Ranger21

Latest version 20.9.4 - updates Gradient Centralization to GC2 (thanks to GC developer) and removes addcmul_ deprecation warnings in PyTorch 1.6.0.



*Latest version is in ranger2020.py - looking at a few other additions before integrating into the main ranger.py.

What is Gradient Centralization? = "GC can be viewed as a projected gradient descent method with a constrained loss function. The Lipschitzness of the constrained loss function and its gradient is better so that the training process becomes more efficient and stable." Source paper: https://arxiv.org/abs/2004.01461v2
Ranger now uses Gradient Centralization by default, and applies it to all conv and fc layers by default. However, everything is customizable so you can test with and without it on your own datasets. (Turn it on or off via the "use_gc" flag at init.)
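
As a rough illustration of what centralization does to a single fully connected layer's gradient, here is a minimal pure-Python sketch. It assumes the gradient is a 2-D list with one row per output unit and subtracts each row's mean, which mirrors the `grad.mean(dim=tuple(range(1, grad.dim())), keepdim=True)` expression quoted in the RangerVA issue further down this page. The function name `centralize_gradient` is made up for this example and is not part of Ranger.

```python
def centralize_gradient(grad):
    """Subtract each row's mean from a 2-D gradient given as a list of rows."""
    centered = []
    for row in grad:
        mean = sum(row) / len(row)
        centered.append([g - mean for g in row])
    return centered


# Two output units with three input weights each (made-up numbers).
grad = [[1.0, 2.0, 3.0], [4.0, 4.0, 10.0]]
centered = centralize_gradient(grad)

for row in centered:
    print(row, "row sum:", sum(row))
```

After centralization every row sums to zero, which is the constraint the quoted paper describes as a projection.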

Best training results - use a 75% flat lr, then step down and run lower lr for 25%, or cosine descend last 25%.


Per extensive testing - It's important to note that simply running one learning rate the entire time will not produce optimal results.
Effectively Ranger will end up 'hovering' around the optimal zone, but can't descend into it unless it has some additional run time at a lower rate to drop down into the optimal valley.
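
One way to read the "75% flat, then cosine descend the last 25%" recommendation is as a multiplier on the base learning rate. The sketch below is an interpretation, not code from this repo: the function name `flat_then_cosine`, the `flat_fraction` parameter and the choice to anneal all the way to zero are assumptions.

```python
import math


def flat_then_cosine(step, total_steps, flat_fraction=0.75):
    """Return a multiplier in [0, 1] to apply to the base learning rate."""
    flat_steps = int(total_steps * flat_fraction)
    if step < flat_steps:
        return 1.0  # flat phase: keep the base learning rate
    # Progress through the annealing phase, from 0.0 up to 1.0.
    progress = (step - flat_steps) / max(1, total_steps - flat_steps)
    progress = min(progress, 1.0)
    return 0.5 * (1.0 + math.cos(math.pi * progress))


base_lr = 1e-3
total_steps = 100
schedule = [base_lr * flat_then_cosine(s, total_steps) for s in range(total_steps)]
```

Since the function returns a multiplicative factor, it should also work as the `lr_lambda` argument of `torch.optim.lr_scheduler.LambdaLR`, for example `LambdaLR(optimizer, lr_lambda=lambda s: flat_then_cosine(s, total_steps))`.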

Full customization at init:


Ranger will now print out id and gc settings at init so you can confirm the optimizer settings at train time:

/////////////////////

Medium article with more info:
https://medium.com/@lessw/new-deep-learning-optimizer-ranger-synergistic-combination-of-radam-lookahead-for-the-best-of-2dc83f79a48d

Multiple updates: 1 - Ranger is the optimizer we used to beat the high scores for 12 different categories on the FastAI leaderboards! (Previous records all held with AdamW optimizer).

2 - Highly recommend combining Ranger with: Mish activation function, and flat+ cosine anneal training curve.

3 - Based on that, also found .95 is better than .90 for beta1 (momentum) param (ala betas=(0.95, 0.999)).

Fixes: 1 - Differential Group learning rates now supported. This was fixed in RAdam and ported here thanks to @sholderbach.

2 - Save and then load may leave first run weights stranded in memory, slowing down future runs = fixed.

Installation

Clone the repo, cd into it and install it in editable mode (-e option). That way, there is no need to re-install the package after modifying it.

git clone https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer
cd Ranger-Deep-Learning-Optimizer
pip install -e . 

Usage

from ranger import Ranger  # this is from ranger.py
from ranger import RangerVA  # this is from ranger913A.py
from ranger import RangerQH  # this is from rangerqh.py

# Define your model
model = ...
# Each of the Ranger, RangerVA, RangerQH have different parameters.
optimizer = Ranger(model.parameters(), **kwargs)

Usage and notebook to test are available here: https://github.com/lessw2020/Ranger-Mish-ImageWoof-5

Citing this work

We recommend you use the following to cite Ranger in your publications:

@misc{Ranger,
  author = {Wright, Less},
  title = {Ranger - a synergistic optimizer.},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer}}
}

ranger-deep-learning-optimizer's People

Contributors

fparodimoraes, lessw2020, mpariente, nestordemeure, scottclowe, sholderbach

ranger-deep-learning-optimizer's Issues

How to cite Ranger in a paper?

In my recent paper I used Ranger. I wish to give all the credit the author(s) deserves, but I'm not sure how to properly cite it? Currently I cited the medium article. Should I cite this github repo instead? Thanks.

This overload of addcmul_ is deprecated: addcmul_(Number value, Tensor tensor1, Tensor tensor2)

I get the following warning when using ranger with pytorch 1.6.0

/path/Ranger-Deep-Learning-Optimizer/ranger/ranger.py:138: UserWarning: This overload of addcmul_ is deprecated:
        addcmul_(Number value, Tensor tensor1, Tensor tensor2)
Consider using one of the following signatures instead:
        addcmul_(Tensor tensor1, Tensor tensor2, *, Number value) (Triggered internally at  /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
  exp_avg_sq.mul_(beta2).addcmul_(1 - beta2, grad, grad)

Let's revolutionize the AI research field

Hi,
I have a dream and I'll try to share it to you.

But before explaining further, I'll need your brain to analyze this input and output me what you think about it!

Small rant on the inertia of AI research

First of all, thank you for advancing progress in deep learning.

I'm just a random guy that wants to implement an AGI (lol) and, like many NLP engineers, I need HIGHLY accurate neural networks for fundamental NLP tasks (e.g. POS tagging, NER, dependency parsing, coreference resolution, WSD, etc.).
They are all not very accurate (often sub 95% F1 score) and their errors add up.

Such limitations make NLP not yet suitable for many things.
This is why improving the state of the art (which can be observed on paperswithcode.com) is a crucial priority for academics.

Effectively, many researchers have smart ideas to improve the state of the art and often slightly improve it by:
Having a "standard neural network" for the task and mix with it their new fancy idea.

I talk from knowledge, I've read most papers from state of the art leaderboards from most fundamental NLP tasks.
Almost always they have this common baseline + one idea, theirs.
The common baseline sometimes slowly evolves (e.g. now it's often a pre-trained model (say BERT) + fine-tuning + their idea).

Sorry to say, but "this" is to me retarded
Where "this" means the fact that, by far, most researchers work in isolation, not integrating others' ideas (or doing so with great inertia).
I would have wished that the state of the art in one NLP task would be a combination of e.g. 50 innovative and complementary ideas from researchers.
You are researchers, do you have an idea why that is the case? If someone actually tried to merge all good complementary and compatible ideas, would they have the best, unmatchable state of the art?
Why don't facebookresearch, Microsoft and Google try the low-hanging fruit: in addition to producing X new shiny ideas per month, actually try to merge them in a coherent, synergetic manner?
I would like you to tell me what you think of this major issue that slows AI progress.

As an example of such inertia let's talk about Swish, Mish or RAdam :
Those things are incredibly easy to try and see "hey does it give to my neural network free accuracy gains?"
Yet not any paper on state of the art leaderboards has tried Swish, Mish or RAdam despite being soo simple to try (you don't need to change the neural network)
Not even pre trained models where so many papers depend on them (I opened issues for each of them).

Once I know what you think about this research inertia, I'll explain my vision of what needs to be done to fix it.

Released on PyPI

I just released this code on PyPI. It's called asranger (ranger was taken).
So it can be installed with pip install asranger and can be made a hard dependency by other projects on PyPI.
The corresponding code is on my fork.

Too huge step_size at initialization stage

I found that step_size is too high in the initial 5 steps.
The problem is in the code:

if N_sma >= self.N_sma_threshhold:
    step_size = math.sqrt((1 - beta2_t) * (N_sma - 4) / (N_sma_max - 4) * (N_sma - 2) / N_sma * N_sma_max / (N_sma_max - 2)) / (1 - beta1 ** state['step'])
else:
    step_size = 1.0 / (1 - beta1 ** state['step'])

If betas are set to (0.9, 0.999) the internal variables are changed as following:

state['step']| step_size
------------------------------
        1    |     10
        2    |5.26315789
        3    |3.6900369
        4    |2.90782204
        5    |2.44194281
        6    |0.00426327
        7    |0.00524248
        8    |0.00607304
        9    |0.00681674
       10    |0.00750596

Note, that step_size doesn't depend on gradient value and it scales learning_rate.
Thus RAdam aggressively moves weights from their initial values, even if they have a good initialization.

Is it better to set step_size equal to 0 if N_sma < self.N_sma_threshhold?
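
The table can be reproduced without PyTorch. The sketch below copies the quoted branch into a standalone function; the threshold value of 5 is an assumption (it is the value that matches the jump between steps 5 and 6 in the table), and the name `radam_step_size` is made up for this example.

```python
import math


def radam_step_size(step, beta1=0.9, beta2=0.999, n_sma_threshold=5):
    """Step-size multiplier from the quoted snippet; it is applied on top of lr."""
    beta2_t = beta2 ** step
    n_sma_max = 2 / (1 - beta2) - 1
    n_sma = n_sma_max - 2 * step * beta2_t / (1 - beta2_t)
    if n_sma >= n_sma_threshold:
        # Rectified branch: the variance estimate is considered tractable.
        rect = math.sqrt(
            (1 - beta2_t)
            * (n_sma - 4) / (n_sma_max - 4)
            * (n_sma - 2) / n_sma
            * n_sma_max / (n_sma_max - 2)
        )
        return rect / (1 - beta1 ** step)
    # Un-rectified branch: only the first-moment bias correction is applied.
    return 1.0 / (1 - beta1 ** step)


for step in range(1, 11):
    print(f"{step:2d} | {radam_step_size(step):.8f}")
```

With betas of (0.9, 0.999) the first five steps take the un-rectified branch, so the multiplier is just 1 / (1 - 0.9^step), which is where the values from 10 down to about 2.44 come from.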

Does it make sense to use it on a batch of 1?

@lessw2020 Thanks for this awesome optimizer. I'm very excited about it!

There is one particular workload that trains using a batch of 1 item.
Theoretically, does it make sense to use RAdam (Rectified Adam), LookAhead, and GC in this context?

I've been thinking about it and have read the papers, but I still could not reach a conclusion. As you (or any other person here) are much more experienced than me, do you have an opinion on this?

cannot load trained model using Ranger

Hi There!

Thanks for putting together this code for Rectified Adam with Lookahead optimizer. I used this optimization function to train my model with fastai and successfully trained the model.

I exported the model using

feature = 'silhouette'
learn.export(f'{feature}_efficientnet-b3.pkl')

and later during inference I am trying to load the learner using

from ranger import Ranger
feature = 'silhouette'
learn = load_learner(path = model_path, file = f'{feature}_efficientnet-b3.pkl')

I have defined the model path properly in the previous cells. But for some reason, I cannot load the learner. The file cannot locate the module ranger.ranger. Can someone please help me fix this issue?

Here's a screenshot of the error for your reference.
Screenshot from 2020-04-02 15-18-02

Thanks & Regards,
Vinayak.

Add manual synchronization function

Hello. First of all, thank you for sharing code and experiment results.
Reading the code, I found that the model will use fast weights to infer. According to LookAhead, fast weights (before synchronization) may perform worse than slow weights. By chance of (1-1/k) probability (80% when k=5), we will use unsynchronized fast weights to validate/test. Therefore, it should be better if we manually synchronize before evaluation.
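
To make the argument concrete, here is a small pure-Python sketch of the LookAhead interpolation and of the fraction of steps that end with unsynchronized weights. The helper names `lookahead_sync` and `unsynced_fraction` are hypothetical, weights are plain lists of floats, and this is not the synchronization function itself, only the arithmetic behind the request.

```python
def lookahead_sync(slow, fast, alpha=0.5):
    """Move slow weights toward fast weights, then reset fast to the new slow."""
    new_slow = [s + alpha * (f - s) for s, f in zip(slow, fast)]
    return new_slow, list(new_slow)


def unsynced_fraction(k, total_steps):
    """Fraction of steps after which fast and slow weights still differ."""
    unsynced = sum(1 for step in range(1, total_steps + 1) if step % k != 0)
    return unsynced / total_steps


slow, fast = lookahead_sync([0.0, 1.0], [1.0, 3.0], alpha=0.5)
print(slow, fast)
print(unsynced_fraction(k=5, total_steps=100))
```

If evaluation happens at an arbitrary step, the second function gives the chance of landing on unsynchronized fast weights, which is the 1 - 1/k figure mentioned above.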

Not able to save the model_state_dict.

Hi, I was trying to save the model checkpoints after each epoch using the code below. But only the state dictionary of the zeroth epoch got stored and none of the others. Does the Ranger optimizer object support state_dict? If yes, how can I save it after each epoch?

out_model = os.path.join(args.model_dir, 'model.th')
with open(out_model, 'wb') as f:
    torch.save(model.state_dict(), f)
print("Model is dumped")

Is there a publication of Ranger?

I want to cite ranger on a Medium article and I would like to know if there is an arXiv publication of Ranger or a published peer-reviewed paper on some conference or journal.

I saw you linked a paper in the README.md, but it does not seem to be about Ranger, as the word itself does not appear anywhere in it. I know the RAdam and Lookahead papers, but the Ranger one is missing from my library. Thanks

How to use ranger in keras? Please help me.

Your optimizer looks like a big achievement!
I have used "optimizer=Ranger(lr=0.001)" in Keras.
But I get an error: "TypeError: __init__() missing 1 required positional argument: 'params'".
I don't know how to debug it. Can you help me?

N_sma_threshhold

You first have
if N_sma > self.N_sma_threshhold:

and then you have
if N_sma > 4:

Is it right that the second one is constant or should that also be N_sma_threshhold parameter?

Please note in the documentation (or in the constructor) that closures must be enabled

Hi,

I had a relatively long debugging session today: after I upgraded my PyTorch Lightning installation, training_step was no longer being called.

It finally turned out that the problem was that the "closure" argument is not used in the step function (it is commented out, as also noted in the source code).

However, as it is apparently required by some libraries and is also recommended by the official PyTorch guidelines, it would be great if it were better documented that people might need to enable these lines.
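
For readers unfamiliar with the convention: an optimizer's step() is expected to call the closure (if one is given) to re-evaluate the model and to return the resulting loss. Below is a toy, framework-free sketch of that pattern. `ToySGD` is a stand-in written for this example and is not `torch.optim.Optimizer` or Ranger; parameters are plain dicts.

```python
class ToySGD:
    """Toy stand-in for an optimizer; it is not torch.optim.Optimizer."""

    def __init__(self, params, lr=0.1):
        self.params = params  # list of dicts with "value" and "grad" keys
        self.lr = lr

    def step(self, closure=None):
        loss = None
        if closure is not None:
            # The closure re-evaluates the model, refreshes the gradients
            # and returns the loss.
            loss = closure()
        for p in self.params:
            if p["grad"] is None:
                continue
            p["value"] -= self.lr * p["grad"]
        return loss


w = {"value": 0.0, "grad": None}
opt = ToySGD([w], lr=0.1)


def closure():
    # Loss (w - 3)^2 with gradient 2 * (w - 3).
    w["grad"] = 2.0 * (w["value"] - 3.0)
    return (w["value"] - 3.0) ** 2


loss = opt.step(closure)
print(loss, w["value"])
```

A framework that drives training through the closure (as described above for PyTorch Lightning) never sees a forward pass if step() ignores the argument, which would explain training_step not being called.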

Thanks in advance.

RangerVA with GC

Hello,

Thank you for your work on these optimizers btw. I was testing a couple out and was performing quite well with the RangerVA originally. Then, when your gradient centralization was added I got further improvements but it also seemed to be overtraining the train set more easily despite using the same parameters. Therefore, I tried to implement combining the gradient centralization into the RangerVA algorithm and so far it seems to be performing quite well and faster since it seems I can use larger batch sizes. I was wondering if you could quickly check, whenever you have some free time, if I implemented correctly in the code below since you are so used to this optimizer.

Best

class RangerVA(Optimizer):

def __init__(self, params, lr=1e-3, 
             alpha=0.5, k=6, n_sma_threshhold=5, betas=(.95,0.999), 
             eps=1e-5, weight_decay=0, amsgrad=True, transformer='softplus', smooth=50,
             grad_transformer='square',use_gc=True, gc_conv_only=False):
    #parameter checks
    if not 0.0 <= alpha <= 1.0:
        raise ValueError(f'Invalid slow update rate: {alpha}')
    if not 1 <= k:
        raise ValueError(f'Invalid lookahead steps: {k}')
    if not lr > 0:
        raise ValueError(f'Invalid Learning Rate: {lr}')
    if not eps > 0:
        raise ValueError(f'Invalid eps: {eps}')

    #prep defaults and init torch.optim base
    defaults = dict(lr=lr, alpha=alpha, k=k, step_counter=0, betas=betas, 
                    n_sma_threshhold=n_sma_threshhold, eps=eps, weight_decay=weight_decay,
                    smooth=smooth, transformer=transformer, grad_transformer=grad_transformer,
                   amsgrad=amsgrad,use_gc=use_gc, gc_conv_only=gc_conv_only )
    super().__init__(params,defaults)

    #adjustable threshold
    self.n_sma_threshhold = n_sma_threshhold   

    #look ahead params
    self.alpha = alpha
    self.k = k 

    #radam buffer for state
    self.radam_buffer = [[None,None,None] for ind in range(10)]
    
    #gc on or off
    self.use_gc=use_gc
    #level of gradient centralization
    self.gc_gradient_threshold = 3 if gc_conv_only else 1
    print(f"Ranger optimizer loaded. \nGradient Centralization usage = {self.use_gc}")
    if (self.use_gc and self.gc_gradient_threshold==1):
        print(f"GC applied to both conv and fc layers")
    elif (self.use_gc and self.gc_gradient_threshold==3):
        print(f"GC applied to conv layers only")


def __setstate__(self, state):
    print("set state called")
    super(RangerVA, self).__setstate__(state)


def step(self, closure=None):
    loss = None
    #Evaluate averages and grad, update param tensors
    for group in self.param_groups:
        for p in group['params']:
            if p.grad is None:
                continue
            grad = p.grad.data.double()
            if grad.is_sparse:
                raise RuntimeError('Ranger optimizer does not support sparse gradients')
            
            amsgrad = group['amsgrad']
            smooth = group['smooth']
            grad_transformer = group['grad_transformer']

            p_data_fp32 = p.data.double()

            state = self.state[p]  #get state dict for this param

            if len(state) == 0:   
                state['step'] = 0
                state['exp_avg'] = torch.zeros_like(p_data_fp32)
                state['exp_avg_sq'] = torch.zeros_like(p_data_fp32)
                if amsgrad:
                    # Maintains max of all exp. moving avg. of sq. grad. values
                    state['max_exp_avg_sq'] = torch.zeros_like(p.data)                    

                #look ahead weight storage now in state dict 
                state['slow_buffer'] = torch.empty_like(p.data)
                state['slow_buffer'].copy_(p.data)

            else:
                state['exp_avg'] = state['exp_avg'].type_as(p_data_fp32)
                state['exp_avg_sq'] = state['exp_avg_sq'].type_as(p_data_fp32)
                                  

            #begin computations 
            exp_avg, exp_avg_sq = state['exp_avg'], state['exp_avg_sq']
            beta1, beta2 = group['betas']
            if amsgrad:
                max_exp_avg_sq = state['max_exp_avg_sq']  
                # Maintains the maximum of all 2nd moment running avg. till now
                torch.max(max_exp_avg_sq, exp_avg_sq, out=max_exp_avg_sq)
                # Use the max. for normalizing running avg. of gradient
                denomc = max_exp_avg_sq.clone()
            else:
                denomc = exp_avg_sq.clone()
            #GC operation for Conv layers and FC layers       
            if grad.dim() > self.gc_gradient_threshold:                    
                grad.add_(-grad.mean(dim = tuple(range(1,grad.dim())), keepdim = True))

            state['step'] += 1              

            #compute variance mov avg
            exp_avg_sq.mul_(beta2).addcmul_(1 - beta2, grad, grad)
            #compute mean moving avg
            exp_avg.mul_(beta1).add_(1 - beta1, grad)
            buffered = self.radam_buffer[int(state['step'] % 10)]
            if state['step'] == buffered[0]:
                N_sma, step_size = buffered[1], buffered[2]
            else:
                buffered[0] = state['step']
                beta2_t = beta2 ** state['step']
                N_sma_max = 2 / (1 - beta2) - 1
                N_sma = N_sma_max - 2 * state['step'] * beta2_t / (1 - beta2_t)
                buffered[1] = N_sma
                if N_sma > self.n_sma_threshhold:
                    step_size = math.sqrt((1 - beta2_t) * (N_sma - 4) / (N_sma_max - 4) * (N_sma - 2) / N_sma * N_sma_max / (N_sma_max - 2)) / (1 - beta1 ** state['step'])
                else:
                    step_size = 1.0 / (1 - beta1 ** state['step'])
                buffered[2] = step_size

            
            ##transformer
            if grad_transformer == 'square':
                grad_tmp = grad**2
                denomc.sqrt_() 
            elif grad_transformer == 'abs':
                grad_tmp = grad.abs()


            exp_avg_sq.mul_(beta2).add_((1 - beta2)*grad_tmp)

            if group['weight_decay'] != 0:
                p_data_fp32.add_(-group['weight_decay'] * group['lr'], p_data_fp32)
            bias_correction1 = 1 - beta1 ** state['step']
            bias_correction2 = 1 - beta2 ** state['step']
            step_size = group['lr'] * math.sqrt(bias_correction2) / bias_correction1                

            
            # ...let's use calibrated alr 
            if N_sma > self.n_sma_threshhold:
                if  group['transformer'] =='softplus':
                    sp = torch.nn.Softplus( smooth)
                    denomf = sp( denomc)
                    p_data_fp32.addcdiv_(-step_size, exp_avg, denomf )
                else:
                    denom = exp_avg_sq.sqrt().add_(group['eps'])
                    p_data_fp32.addcdiv_(-step_size * group['lr'], exp_avg, denom)
            else:
                p_data_fp32.add_(-step_size * group['lr'], exp_avg)
            p.data.copy_(p_data_fp32)

            #integrated look ahead...
            #we do it at the param level instead of group level
            if state['step'] % group['k'] == 0:
                slow_p = state['slow_buffer'] #get access to slow param tensor
                slow_p.add_(self.alpha, p.data - slow_p)  #(fast weights - slow weights) * alpha
                p.data.copy_(slow_p)  #copy interpolated weights to RAdam param tensor

    return loss

larger learning rate + large weight decay performs better?

Hi all,
My colleague and I tried a combination of (relatively) large Ranger learning rate (say, 0.001) + large weight decay (say, 0.1). Seems the large decay leads to better performance? We tried two different models, and observed 0.5-1.5% increase of ImageNet classification accuracy, but both models were customized models, and not standard ones like Resnet.
Not sure whether anyone else finds similar results.

Making it a python package

Would you like to make this a python package that could be installed with pip? It would be more practical.

I'd like to include it in my repo asteroid and give you proper credit for it.

One way is to install a python package (I can make a PR for that), the other one would be to copy-paste some of the code and point to the license file. Which way would you prefer?

TypeError in GC operation for Conv layers and FC layers

TypeError: mean() received an invalid combination of arguments - got (keepdim=bool, dim=tuple, ), but expected one of:

  • ()
  • (torch.dtype dtype)
  • (int dim, torch.dtype dtype)
    didn't match because some of the keywords were incorrect: keepdim
  • (int dim, bool keepdim, torch.dtype dtype)
  • (int dim, bool keepdim)
    didn't match because some of the arguments have invalid types: (dim=tuple, keepdim=bool, )

Loss stuck after 1 epoch

Just a warning to the curious, I tried to train DCCRN (from https://github.com/mpariente/asteroid) with Ranger2020 (default params) and it was stuck at a large loss after less than 1 epoch, and loss did not improve for another 30 epochs. I did not debug further. Adam with default params works very well.

Ranger and pytorch DDP

I tried Ranger vs AdamW on a single-GPU and an 8-GPU setup. While Ranger is better on a single GPU, on the DDP setup it performs worse. Any advice?

Spelling, variables and PEP8

Hi and thanks for the code!

I am using your script for my code and while adapting it to PEP8 specs I found a few details that you may want to change. These are style changes that add clarity, but of course it is up to you whether to adhere to PEP8 recommendations or not. I could prepare a pull request as well if you like this style.

required (from torch.optim.optimizer) is not used
itertools is not used

N_sma_threshhold <-- the variable name should not begin with a capital letter, also "threshold" has a typo

k <-- is an important variable with an obscure name, perhaps something like "lookahead_steps" would be more clear?

Multi-line comments should use """comment"""
Normal comments need a space after #

betas=(.95,0.999) (and others) need a space after the comma

You have commented code that is not used, perhaps it would be best to remove it altogether.

Spaces are not consistent and don't agree with PEP8.

Does it work well for transformers?

I am working on a transformer now.
#13 I see this issue, but no one has said they get a better result than AdamW yet.
Has anyone already made Ranger work well when fine-tuning a transformer?

Also, I do not understand the Readme: 'Best training results - use a 75% flat lr, then step down and run lower lr for 25%, or cosine descend last 25%.'
I use 1e-4 lr now, what is the '75% flat lr'?
What is 'lower lr for 25%'?
Could you show me some demo code for adjusting the lr, other than the code that initializes Ranger?

Not working using cuda

Variables self.slow_weights are always on cpu.
You can easily fix this by adding a .to() method in Ranger class like so:

def to(self, device):
    # device is expected to be a plain string here; compare with ==,
    # since "is" checks object identity and is unreliable for strings
    if device == "cuda":
        for i in range(len(self.slow_weights)):
            for j, w in enumerate(self.slow_weights[i]):
                self.slow_weights[i][j] = w.cuda()
    elif device == "cpu":
        for i in range(len(self.slow_weights)):
            for j, w in enumerate(self.slow_weights[i]):
                self.slow_weights[i][j] = w.cpu()

Loading state doesn't seem to be fully working

To save : 'optimizer' : optimizer.state_dict()

optimizer.load_state_dict(checkpoint['optimizer'])

However, I have the impression that restarting the training always brings the accuracy down, and then it recovers.

Best,
Thomas Chaton

Did you try to fine-tune transformers LM with Ranger?

Recent transformers architectures are very famous in NLP: BERT, GPT-2, RoBERTa, XLNET. Did you try to fine-tune them on some NLP task? If so, what was the best Ranger hyper-parameters and learning rate scheduler?

N_sma_threshhold should be instance variable

Thank you for the great implementation.
I think I found a small part to modify at ranger.py line 116.

original code:
if N_sma > N_sma_threshhold:

to be left:
if N_sma > self.N_sma_threshhold:

step_counter not set

Hi,
thanks for your work.

I just plugged it into my model and found that step_counter was not set for all param_groups.

I fixed it with this hack:

        #look ahead tracking and updating if latest batch = k
        for group,slow_weights in zip(self.param_groups,self.slow_weights):
            if 'step_counter' not in group:
                group["step_counter"] = 0

but I suspect it's not optimal...
this would mean that self.param_groups changed between the constructor and step(), but I have no idea why. Have you seen something similar before?

Thanks
