lucidrains / bs-roformer Goto Github PK
View Code? Open in Web Editor NEWImplementation of Band Split Roformer, SOTA Attention network for music source separation out of ByteDance AI Labs
License: MIT License
Implementation of Band Split Roformer, SOTA Attention network for music source separation out of ByteDance AI Labs
License: MIT License
In the current implementation, the forward()
method is generic for train or eval mode. In some case, we need to have not only the loss but the prediction on output that allow to compute extra features like the SDR metric during the validation step.
Because the loss function code is common for BSRoformer
and MelBandRoformer
classes, maybe that can be better create a new class like MultiResLoss
for a maximum of flexibility:
import torch
import torch.nn.functional as F
from einops import rearrange
from beartype import beartype
from beartype.typing import Tuple
class MultiResLoss():
@beartype
def __init__(
self,
num_stems,
stft_n_fft,
multi_stft_resolution_loss_weight = 1.,
multi_stft_resolutions_window_sizes: Tuple[int, ...] = (4096, 2048, 1024, 512, 256),
multi_stft_hop_size = 147,
multi_stft_normalized = False
):
self.num_stems = num_stems
self.multi_stft_resolution_loss_weight = multi_stft_resolution_loss_weight
self.multi_stft_resolutions_window_sizes = multi_stft_resolutions_window_sizes
self.multi_stft_n_fft = stft_n_fft
self.multi_stft_kwargs = dict(
hop_length = multi_stft_hop_size,
normalized = multi_stft_normalized
)
def __call__(
self,
predict,
targets,
return_loss_breakdown = False
):
if self.num_stems > 1:
assert targets.ndim == 4 and targets.shape[1] == self.num_stems
if targets.ndim == 2:
targets = rearrange(targets, '... t -> ... 1 t')
targets = targets[..., :predict.shape[-1]] # protect against lost length on istft
loss = F.l1_loss(predict, targets)
multi_stft_resolution_loss = 0.
for window_size in self.multi_stft_resolutions_window_sizes:
res_stft_kwargs = dict(
n_fft = max(window_size, self.multi_stft_n_fft), # not sure what n_fft is across multi resolution stft
win_length = window_size,
return_complex = True,
**self.multi_stft_kwargs,
)
predict_Y = torch.stft(rearrange(predict, '... s t -> (... s) t'), **res_stft_kwargs)
targets_Y = torch.stft(rearrange(targets, '... s t -> (... s) t'), **res_stft_kwargs)
multi_stft_resolution_loss = multi_stft_resolution_loss + F.l1_loss(predict_Y, targets_Y)
weighted_multi_resolution_loss = multi_stft_resolution_loss * self.multi_stft_resolution_loss_weight
total_loss = loss + weighted_multi_resolution_loss
if not return_loss_breakdown:
return total_loss
return total_loss, (loss, multi_stft_resolution_loss)
In the same spirit, a little refactoring could be to create a new file for the common classes :
- RMSNorm
- FeedForward
- Attention
- Transformer
- BandSplit
- MLP
- MaskEstimator
That can be easier for future change in the code?
Hi,
Has anyone successfully reproduced the results from the paper? I am interested in doing more experiments with this architecture.
Thanks!
Hello @lucidrains
I have just open sourced my Mel-band Roformer vocal model, it performs slightly better than the authors model because I trained it with more data. I also included a google colab for people if they want to use it.
https://github.com/KimberleyJensen/Mel-Band-Roformer-Vocal-Model
Thank you again for implementing this paper !!
I am a bit confused of the gates
in the Attention module of the bs_roformer.py
. The code in lines 103-105 is
out = self.attend(q, k, v)
gates = self.to_gates(x)
out = out * rearrange(gates, 'b n h -> b h n 1').sigmoid()
out = rearrange(out, 'b h n d -> b n (h d)')
return self.to_out(out)
From my understanding this is not the standard multi head attention approach and the paper has not mentioned anything about using something else. Therefore, I would remove the parts using gates
resulting in the following code:
out = self.attend(q, k, v)
out = rearrange(out, 'b h n d -> b n (h d)')
out = self.to_out(out)
What got I wrong? What is the gates
for and why is it used can you clear this up?
Hello,
Firstly, thank you for making the code publicly available. I am attempting to reproduce the results of the BS-RoFormer paper to achieve similar SDR performance and am referencing your code to do so. However, I am encountering significantly lower performance during reproduction and have a few questions. I would appreciate it if you could answer the following queries:
The hyperparameters I used are as follows:
dim: 384
depth: 6
stereo: True
num_stems: 1
time_transformer_depth: 2
freq_transformer_depth: 2
dim_head: 64
heads: 8
ff_dropout: 0.1
attn_dropout: 0.1
flash_attn: True
mask_estimator_depth: 2
Here are my questions:
Could you please share the model and training hyperparameters you used during training?
The paper mentions using a complex spectrogram as input, but I noticed the code uses torch.view_as_real to handle the input in a CaC manner. I believe this is different from the paper. Could you explain the reason for this difference?
I am running the training on an H100 80GB GPU with the above hyperparameters. Despite slight differences from the paper's hyperparameters, the batch_size of 4 fills up the 80GB memory. Could you let me know what batch_size you used and if there were any additional steps taken to manage memory efficiency? For reference, I used 44.1kHz 8-second audio for both target and mixture inputs, as per the paper's setup.
Your answers would be greatly helpful. Thank you very much!
Hello. I tried to run a training.
model = MelBandRoformer(
stereo=True,
dim=32,
depth=1,
attn_dropout=0.1,
time_transformer_depth=1,
freq_transformer_depth=1,
)
But I got an error:
RuntimeError: split_with_sizes expects split_sizes to sum exactly to 3964 (input tensor's size at dimension -1), but got split_sizes=[24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 28, 32, 32, 32, 36, 40, 44, 44, 44, 52, 56, 60, 64, 64, 68, 76, 80, 84, 92, 100, 104, 112, 120, 124, 136, 148, 156, 164, 176, 188, 200, 216, 228, 244, 264, 280, 296, 316, 340, 364, 388, 412, 440, 472, 504]
Without stereo=True it works normally.
Hi, according to the paper,
We do not use the In-House dataset for ablation study. The effective batch size is 64 (i.e., 4 for each GPU) using accumulate grad batches=2.
and
Our model takes a segment of 8-seconds waveform for input and output.
for the L=6 model.
Based on my understanding, they train every 2 audio segments with 8s on each GPU without accumulate batch.
However, I hardly to fit in while I am only training with one V100.
Another thing is the size of the model
In the paper, they mentioned
The numbers of parameters for BS-RoFormer and BS-Transformer with L=6 are 72.2M and 72.5M.
While I following the default setting from the paper, my model is 108M
model = Model(
dim=384,
depth=6,
time_transformer_depth=1,
freq_transformer_depth=1,
heads=8,
attn_dropout=0.1,
ff_dropout=0.1,
dim_head=48,
stereo=True
)
Anything wrong?
I noticed error in Linear Attention layer when flash_attn
is set in True
:
File "attend.py", line 84, in flash_attn
out = F.scaled_dot_product_attention(
RuntimeError: No available kernel. Aborting execution.
In standard Attention (time_transformer
and freq_transformer
) it works ok.
So currently as workaround I did followig change in LinearAttention
(set False to flash):
self.attend = Attend(
scale=scale,
dropout=dropout,
flash=False
)
In readme you use the following parameters as example:
time_transformer_depth = 1
freq_transformer_depth = 1
However, these are the defaults in the BSRoformer class:
time_transformer_depth = 2
freq_transformer_depth = 2
What would you recommend?
hello,
I have two question about the Linear Attention that was added later. Can you clear me up why it is called Linear Attention when the referenced paper introduces Cross-Covariance Attention and why exactly is it better than Self-Attention?
Hello, there is a problem with input data. You write
x = torch.randn(2, 131680)
target = torch.randn(2, 131680)
loss = model(x, target = target)
loss.backward()
# after much training
out = model(x)
But in reality there must be also some batch size:
x = torch.randn(10, 2, 131680)
target = torch.randn(10, 2, 131680)
loss = model(x, target = target)
loss.backward()
# after much training
out = model(x)
If I use like in bottom example I have a problem with stft.
RuntimeError: stft(torch.cuda.FloatTensor[6, 2, 133728], n_fft=2048, hop_length=512, win_length=2048, window=None, normalized=0, onesided=None, return_complex=1) : expected a 1D or 2D tensor
There is a problem with model if you try to use multiGPU:
Traceback (most recent call last):
File "train_local_run.py", line 44, in <module>
train_model(args)
File "train.py", line 239, in train_model
loss = model(x, y)
File "\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "\site-packages\torch\nn\parallel\data_parallel.py", line 171, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "\site-packages\torch\nn\parallel\data_parallel.py", line 181, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "\site-packages\torch\nn\parallel\parallel_apply.py", line 89, in parallel_apply
output.reraise()
File "\site-packages\torch\_utils.py", line 644, in reraise
raise exception
StopIteration: Caught StopIteration in replica 0 on device 0.
Original Traceback (most recent call last):
File "\site-packages\torch\nn\parallel\parallel_apply.py", line 64, in _worker
output = module(*input, **kwargs)
File "\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "\models\bs_roformer\mel_band_roformer.py", line 437, in forward
stft_window = self.stft_window_fn(device=self.device)
File "\models\bs_roformer\mel_band_roformer.py", line 403, in device
return next(self.parameters()).device
StopIteration
What is the correct way to fix it?
I tried to train this neural net without any success. SDR stuck in around 2.1 for vocals and never grows more. If somebody have better results please let me know.
Hi, before starting training on a large corpus, I've just test the latest release 3.0 and the new MelBand model. Both models are initialized with the same parameters, no change on others parameters.
model = MelBandRoformer(dim = 384,
depth = 9,
time_transformer_depth = 1,
freq_transformer_depth = 1,
stereo=stereo,
sample_rate=sr).to(device)
For in-situation condition, I load a audio stereo mixture of 8 seconds @ 44100 Hz with a simil target (a drums stem). The batch size is 1, so the tensors are of size : [1, 2, 352800]
After calling a single training forward step, the backward loss seem coherent:
BandSplit : tensor(2.3477, device='cuda:0', grad_fn=<AddBackward0>)
MelBand : tensor(2.2762, device='cuda:0', grad_fn=<AddBackward0>)
Unfortunatly, when I save back the audio of both models outputs, I got a strange behavior in the MelBand model. Better than words, the spectrograms:
The spectrogram show that the MelBand model output cut the frequencies above 11025 Hz, so half of the Nyquist frequency of 22050 Hz for a 44100 Hz audio.
I don't know if it's normal or a bug, but I prefer to share the information here.
Thank's so much for BS-RoFormer !!!
Thanks for your great work.
In the BS-RoFormer paper, the authors mention:
Each MLP layer consists of a RMS Norm layer, a fully connected layer followed by a Tanh activation, and a fully connected layer followed by a gated linear unit (GLU) layer[29].
However, your MaskEstimator implementation is somewhat different from the paper description.
Here's my implementation (I also checked the number of parameters):
class TanH(Module):
def __init__(self, dim_in, dim_out):
super().__init__()
self.proj = nn.Linear(dim_in, dim_out)
def forward(self, x):
x = self.proj(x)
return x.tanh()
class GLU(Module):
def __init__(self, dim_in, dim_out):
super().__init__()
self.proj = nn.Linear(dim_in, dim_out * 2)
def forward(self, x):
x, gate = self.proj(x).chunk(2, dim=-1)
return x * gate.sigmoid()
class MaskEstimator(Module):
@beartype
def __init__(
self,
dim,
dim_inputs: Tuple[int, ...],
mlp_expansion_factor = 4,
):
super().__init__()
self.dim_inputs = dim_inputs
self.to_freqs = ModuleList([])
for dim_in in dim_inputs:
net = []
net.append(TanH(dim, dim * ff_mult))
net.append(GLU(dim * ff_mult, dim_in))
self.to_freqs.append(nn.Sequential(*net))
def forward(self, x):
x = x.unbind(dim=-2)
outs = []
for band_features, to_freq in zip(x, self.to_freqs):
freq_out = to_freq(band_features)
outs.append(freq_out)
return torch.cat(outs, dim=-1)
Hi! Amazing work!
Any chance it can run on MPS (Metal Performance Shaders) - Apple GPU?
In the paper second sentence in 4.4 configuration they mention:
The multi-band mask estimation module utilizes MLPs with a hidden layer dimension of 4D.
But in the actual code from my understanding there is no hidden layer only 2 linear layers smart combined to a single one.
I am still a bit confused because in the code is the following comment
in line 241 stft_hop_length = 512, # 10ms at 44100Hz, from sections 4.1, 4.4 in the paper - @faroit recommends // 2 or // 4 for better
It's true that they are mentioned to use a hop_size of 10ms for a 44.1kHz sample rate, but in my calculation 512 is not 10ms but ~11.6ms
Did I missunderstood something or what is the intention here ?
Hi @lucidrains
I'm working on a training code to share here and attempt to reproduce the same workflow than the SAMI-ByteDance paper. I will share the code here. In order to make the training code generic, and because I'm pretty newbie in this kind of task, I want to be sure to clearly understand your code.
EDIT: I remove this very stupid assumption.
This is a less important, but for better following of the code:
We use D = 384 for feature dimension, L = 12 for the number of Transformer blocks, 8 heads for each Transformer, and a dropout rate of 0.1.
I don't know if the choice of attn_dropout = 0.
and ff_dropout = 0.
as default parameters for the Transformers
is motivated by some technical considerations. If not, maybe it's better to change the values to follow the original paper?
We use 60 Mel-bands, as it is similar to the number of subbands, i.e., 62, adopted by BS-RoFormer.
Same here with the default parameter num_bands = 62
of the MelBandRoformer
?
Thank you very much to the teacher for providing such good sound source separation. For a junior student like me, can you provide a specific tutorial on how to separate the sound source? I am using win10 system 64 bit
I am a novice in this field and would like to complete the task of audio source separation. This project seems to be for that purpose, but I couldn't understand how to use it for audio source separation based on the examples in the documentation.
I am confused about why the tensor length in the readme is 131680, from the paper I got that they use 8s of audio with 44.1kHz which makes from my understand 352800 😅
Thank you very much for your code. You rock!
Is Flash Attention only supported by A100 GPU ?
First of all, thanks for all the good work (including in this repo)!
There's a potential bug in LinearAttention that I thought I should bring to your attention: In contrast to the original paper, you learn the logarithmic temperature, but still initialize the parameter with ones instead of zeros. This means the temperature will initially be Euler's number. Maybe it doesn't make much of a difference in practice or works even better, but it looks like it may have been unintentional.
Hi, let me introduce this little benchmark of the model running on GPU (device='cuda'
):
---------- Training ----------
Time: 0.019369 seconds in "STFT" stage.
Time: 0.704290 seconds in "Band Split" stage.
Time: 0.124404 seconds in "Transformer" stage.
Time: 0.017830 seconds in "Mask Estimator" stage.
Time: 6.671448 seconds in "ISTFT" stage.
--------------------------------
Time: 0.147087 seconds in "Multiresolution Loss" stage.
Time: 0.493661 seconds in "Loss Backward" stage.
--------------------------------
Time: 9.102034 seconds for the whole process.
---------- Evaluation ----------
Time: 0.000107 seconds in "STFT" stage.
Time: 0.003609 seconds in "Band Split" stage.
Time: 0.015426 seconds in "Transformer" stage.
Time: 0.009862 seconds in "Mask Estimator" stage.
Time: 3.528062 seconds in "ISTFT" stage.
--------------------------------
Time: 3.565335 seconds for the whole process.
The model specifications follow the original paper:
model = BSRoformer(dim = 384,
depth = 12,
time_transformer_depth = 1,
freq_transformer_depth = 1,
mask_estimator_depth = 4).to(device)
The tensors for testing are initialized using:
audio_size = 4 * 44100 # 4 seconds @ 44.1 KHz
sample = torch.randn(2, audio_size, dtype=torch.float32).to(device)
target = torch.randn(2, audio_size, dtype=torch.float32).to(device)
The benchmark do not include the einops operations, nor the other tensor manipulation but bounds the stages of the model like that:
self.bench.start('Band Split')
x = self.band_split(x)
self.bench.stop()
# axial / hierarchical attention
self.bench.start('Transformer')
for time_transformer, freq_transformer in self.layers:
x = rearrange(x, 'b t f d -> b f t d')
x, ps = pack([x], '* t d')
x = time_transformer(x)
x, = unpack(x, ps, '* t d')
x = rearrange(x, 'b f t d -> b t f d')
x, ps = pack([x], '* f d')
x = freq_transformer(x)
x, = unpack(x, ps, '* f d')
x = self.final_norm(x)
self.bench.stop()
num_stems = len(self.mask_estimators)
self.bench.start('Mask Estimator')
mask = torch.stack([fn(x) for fn in self.mask_estimators], dim = 1)
self.bench.stop()
The Benchmark
class is a pretty trivial one:
from time import perf_counter
class Benchmark():
def __init__(self):
pass
def start(self, stage):
self.stage = stage
self.time_start = perf_counter()
def stop(self):
self.time_duration = perf_counter() - self.time_start
print(f'Time: {self.time_duration:.6f} seconds in "{self.stage}" stage.')
66% of the time in the model is lost in the torch.istft
process while the torch.stft
is not slow at all.
Am I the only one to notice this?
Wrong conclusion, see next message.
Is it possible to reduce some aspects of Melband to give good results still but be able to train on smaller GPUs?
I have tried finetuning Kimberley's model with my 3090 but with each epoch the results get slightly worse. This is with only audio chunk size reduced to about half and batch size 1-2 (regardless of accum grad)
After separating STFT and ISTFT from the BSRoformer
class, I was able to successfully export the model to ONNX, and trtexec
could convert the ONNX model to a TensorRT engine. However, TensorRT did not accelerate the inference; instead, it was twice as slow compared to the Torch implementation.
Torch takes approximately 0.13 seconds to infer a slice, while TensorRT takes 0.27 seconds to infer the same slice (tested on an RTX 4090). Using NVIDIA Nsight for monitoring, the preliminary analysis suggests that the slowdown is caused by the Tile operation. Is there any way to alleviate this issue in TensorRT without retraining the model?
Nsight Result:
Modified source code:
https://github.com/bfloat16/Music-Source-Separation-Training
I'm trying to reproduce the paper model. But I have no luck.
My current settings which gave me batch size only 2 for 48 GB memory:
dim: 192
depth: 8
stereo: true
num_stems: 1
time_transformer_depth: 1
freq_transformer_depth: 1
num_bands: 60
dim_head: 64
heads: 8
attn_dropout: 0.1
ff_dropout: 0.1
flash_attn: True
dim_freqs_in: 1025
sample_rate: 44100 # needed for mel filter bank from librosa
stft_n_fft: 2048
stft_hop_length: 512
stft_win_length: 2048
stft_normalized: False
mask_estimator_depth: 2
multi_stft_resolution_loss_weight: 1.0
multi_stft_resolutions_window_sizes: !!python/tuple
- 4096
- 2048
- 1024
- 512
- 256
multi_stft_hop_size: 147
multi_stft_normalized: False
On input I give 8 seconds of 44100Hz so length is 352800.
I run my code model through torchinfo:
from torchinfo import summary
summary(model, input_size=(1, 2, 352768))
Report is:
==============================================================================================================
Layer (type:depth-idx) Output Shape Param #
==============================================================================================================
MelBandRoformer [1, 2, 352768] 56,503,768
├─ModuleList: 1-1 -- --
│ └─ModuleList: 2-1 -- 384
│ │ └─Transformer: 3-77 [60, 690, 192] (recursive)
│ │ └─Transformer: 3-78 [690, 60, 192] (recursive)
│ └─ModuleList: 2-2 -- 384
│ │ └─Transformer: 3-79 [60, 690, 192] (recursive)
│ │ └─Transformer: 3-80 [690, 60, 192] (recursive)
│ └─ModuleList: 2-3 -- 384
│ │ └─Transformer: 3-81 [60, 690, 192] (recursive)
│ │ └─Transformer: 3-82 [690, 60, 192] (recursive)
│ └─ModuleList: 2-4 -- 384
│ │ └─Transformer: 3-83 [60, 690, 192] (recursive)
│ │ └─Transformer: 3-84 [690, 60, 192] (recursive)
│ └─ModuleList: 2-5 -- 384
│ │ └─Transformer: 3-85 [60, 690, 192] (recursive)
│ │ └─Transformer: 3-86 [690, 60, 192] (recursive)
│ └─ModuleList: 2-6 -- 384
│ │ └─Transformer: 3-87 [60, 690, 192] (recursive)
│ │ └─Transformer: 3-88 [690, 60, 192] (recursive)
│ └─ModuleList: 2-7 -- 384
│ │ └─Transformer: 3-89 [60, 690, 192] (recursive)
│ │ └─Transformer: 3-90 [690, 60, 192] (recursive)
│ └─ModuleList: 2-8 -- 384
│ │ └─Transformer: 3-91 [60, 690, 192] (recursive)
│ │ └─Transformer: 3-92 [690, 60, 192] (recursive)
├─BandSplit: 1-2 [1, 690, 60, 192] --
│ └─ModuleList: 2 -- --
....
├─ModuleList: 1-1 -- --
│ └─ModuleList: 2-1 -- 384
│ │ └─Transformer: 3-77 [60, 690, 192] (recursive)
│ │ └─Transformer: 3-78 [690, 60, 192] (recursive)
│ └─ModuleList: 2-2 -- 384
│ │ └─Transformer: 3-79 [60, 690, 192] (recursive)
│ │ └─Transformer: 3-80 [690, 60, 192] (recursive)
│ └─ModuleList: 2-3 -- 384
│ │ └─Transformer: 3-81 [60, 690, 192] (recursive)
│ │ └─Transformer: 3-82 [690, 60, 192] (recursive)
│ └─ModuleList: 2-4 -- 384
│ │ └─Transformer: 3-83 [60, 690, 192] (recursive)
│ │ └─Transformer: 3-84 [690, 60, 192] (recursive)
│ └─ModuleList: 2-5 -- 384
│ │ └─Transformer: 3-85 [60, 690, 192] (recursive)
│ │ └─Transformer: 3-86 [690, 60, 192] (recursive)
│ └─ModuleList: 2-6 -- 384
│ │ └─Transformer: 3-87 [60, 690, 192] (recursive)
│ │ └─Transformer: 3-88 [690, 60, 192] (recursive)
│ └─ModuleList: 2-7 -- 384
│ │ └─Transformer: 3-89 [60, 690, 192] (recursive)
│ │ └─Transformer: 3-90 [690, 60, 192] (recursive)
│ └─ModuleList: 2-8 -- 384
│ │ └─Transformer: 3-91 [60, 690, 192] (recursive)
│ │ └─Transformer: 3-92 [690, 60, 192] (recursive)
├─ModuleList: 1 -- --
│ └─MaskEstimator: 2-9 [1, 690, 7916] --
==============================================================================================================
Total params: 69,102,468
Trainable params: 69,102,404
Non-trainable params: 64
Total mult-adds (G): 8.35
==============================================================================================================
Input size (MB): 2.82
Forward/backward pass size (MB): 703.40
Params size (MB): 232.17
Estimated Total Size (MB): 938.40
==============================================================================================================
From report I expect to have batch more than 48. But in the end I can use batch only 2.
GPU memory usage for batch = 2:
To follow the paper I must increase dim to 384, depth to 12 and decrease stft_hop_length to 441 - to be 10 ms. In this case batch size will be only 1 or not fit in memory )
Any ideas how to deal with such big memory usage?
Hi @lucidrains + @ZFTurbo
I work with birds and have a audio dataset the sounds they make that I would like to train with this code, I have had success with other models that were authored for music. Please can I request for a way to train this model to be added to the code? Sorry i can not accomplish this myself.
Originally posted by @lyndonlauder in #4 (comment)
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.