
syndiffix's People

Contributors

cristianberneanu, edongashi, pdobacz, yoid2000


Forkers

yoid2000

syndiffix's Issues

Never include AID columns in output

It occurs to me that we should not include AID columns in the output.

The main reason is that the AID columns have no value, but the user might not know this.

In particular, we don't capture any event information, for instance the distribution of rows over AIDs, or inter-event timing or sequences. But if we include some kind of AID column, then the user might assume that we do capture this stuff, and get bad results.

Until we implement event information, it seems it would be cleaner and clearer just to exclude the AID columns in the output.
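A minimal sketch of the proposed exclusion, assuming the output schema is just a list of column names (the function and parameter names below are illustrative, not the actual syndiffix API):

```python
# Hypothetical sketch: drop AID columns from the synthesized output.
def exclude_aid_columns(columns, aid_columns):
    """Return the output column list with all AID columns removed."""
    aid_set = set(aid_columns)
    return [c for c in columns if c not in aid_set]
```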

Implement synthetization pipeline.

Once we have all the individual stages available, we need a top-level module that assembles the full syndiffix pipeline:

  • select input columns
  • convert data
  • build forest
  • cluster columns
  • build trees
  • harvest buckets
  • generate microdata
  • stitch or patch columns
  • generate output
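The stages above could be assembled as a simple function chain; the stage functions below are trivial stand-ins for illustration, not the real syndiffix modules:

```python
# Illustrative sketch of a top-level pipeline that threads the data
# through each stage in order; stage implementations are placeholders.
def run_pipeline(raw_rows, stages):
    data = raw_rows
    for stage in stages:
        data = stage(data)
    return data

# Trivial stand-in stages (the real ones would be select/convert/forest/...):
stages = [
    lambda d: [r for r in d if r is not None],  # stand-in for input selection
    lambda d: sorted(d),                        # stand-in for the remaining stages
]
```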

Clustering solutions are worse than in the F# implementation.

I was playing with a custom script that takes the first 50 K rows from the taxi-one-day.csv dataset and selects 5 specific columns for processing, and this is the output that I get:

Loaded 50000 rows. Columns:
0: pickup_longitude (float64)
1: pickup_latitude (float64)
2: fare_amount (float64)
3: rate_code (int64)
4: passenger_count (int64)

Fitting the synthesizer over the data...
Column clusters:
Initial= [2, 4]
Derived= (SHARED, [2], [0])
Derived= (SHARED, [2], [1])
Derived= (SHARED, [2], [3])

Notice how the initial cluster only has 2 columns in it.
The F# implementation produces this output:

=== Columns ===
  0 pickup_longitude (RealType); Entropy = 11.828231226413235
  1 pickup_latitude (RealType); Entropy = 11.799770332830937
  2 fare_amount (RealType); Entropy = 5.4025513988026965
  3 rate_code (IntegerType); Entropy = 0.2387340586221852
  4 passenger_count (IntegerType); Entropy = 0.6315105894728646
Assigning clusters...
Clusters: { InitialCluster = [|0; 1; 2|]
  DerivedClusters = [(Shared, [|2|], [|3|]); (Shared, [|2|], [|4|])] }.
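The F# output above reports a per-column entropy. A minimal stdlib sketch of such a Shannon entropy measure (in bits), shown here only to illustrate the statistic and not taken from the actual syndiffix code:

```python
import math
from collections import Counter

def column_entropy(values):
    """Shannon entropy (bits) of a column's value distribution."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```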

Crash during stitching when processing constant data.

>>> Synthesizer(pandas.DataFrame(numpy.ones((2,200)))).sample()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python310\lib\site-packages\syndiffix\synthesizer.py", line 99, in sample
    rows, root_combination = build_table(
  File "C:\Python310\lib\site-packages\syndiffix\clustering\stitching.py", line 381, in build_table
    acc = _stitch(materialize_tree, forest, metadata, acc, derived_cluster)
  File "C:\Python310\lib\site-packages\syndiffix\clustering\stitching.py", line 372, in _stitch
    return _do_stitch(forest, metadata, left, right, derived_cluster)
  File "C:\Python310\lib\site-packages\syndiffix\clustering\stitching.py", line 300, in _do_stitch
    raise ValueError(f"Empty sequence in cluster {right_combination}.")
ValueError: Empty sequence in cluster (0, 2).

Generating the salt

Do we have anything that configures the salt?

It seems to me that we should automatically create a good salt the first time syndiffix-py is run.

I see that the typical usage for obtaining the seed is like this:

noise = _generate_noise(anon_params.salt, "noise", noise_sd, (context.bucket_seed, aid_seed))

What we could do instead is replace anon_params.salt with a get_salt() routine that checks whether the salt is set to something other than the default and, if not, sets it to a cryptographically strong value. Here is a library for that:

https://docs.python.org/3/library/secrets.html
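A minimal sketch of the proposed get_salt() behavior, assuming a simple anon_params object with a salt field (the SimpleNamespace stand-in below is illustrative, not the real structure):

```python
import secrets
from types import SimpleNamespace

_DEFAULT_SALT = b""

def get_salt(anon_params):
    # Generate and cache a cryptographically strong salt on first use.
    if anon_params.salt == _DEFAULT_SALT:
        anon_params.salt = secrets.token_bytes(32)
    return anon_params.salt

# Stand-in for the real anon_params structure:
params = SimpleNamespace(salt=_DEFAULT_SALT)
salt = get_salt(params)
```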

Avoid unnecessary xxx*yyy style strings

The current implementation produces xxx*yyy-style strings when the original string needs to be suppressed (where xxx is a possibly-empty prefix and yyy is a number).

The primary reason for this is to avoid releasing any strings that fail LCF. The problem is that we are too aggressive about this and suppress strings that, strictly speaking, don't need to be suppressed. This happens because we partition a given column's values at 2dim and above.

What we should do instead is use the 1dim tree to determine the set of valid strings, and then at Ndim choose strings only from that set.
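A sketch of the selection step in the proposed approach: at Ndim, emit a string only if it appears in the set of strings the 1dim tree found valid, and otherwise fall back to the suppressed prefix*count form (function and parameter names are hypothetical):

```python
# Hypothetical helper: valid_strings would be derived from the 1dim tree.
def choose_string(candidate, valid_strings, prefix, count):
    if candidate in valid_strings:
        return candidate
    return f"{prefix}*{count}"
```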

Clustering strategy interface

There are different ways to build clusters:

  • General purpose with dependence measure
  • Target column where we put the main column in every cluster.
  • Target column with ML feature selection, with or without patching
  • (Optional) Target column Univariate feature selection, with or without patching
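One way to support these variants is a small strategy interface; the class and method names below are illustrative, not the actual syndiffix API:

```python
from abc import ABC, abstractmethod

class ClusteringStrategy(ABC):
    @abstractmethod
    def build_clusters(self, columns):
        """Return (initial_cluster, derived_clusters) over column indices."""

class SingleClusterStrategy(ClusteringStrategy):
    """Trivial general-purpose strategy: put every column in one cluster."""
    def build_clusters(self, columns):
        return list(range(len(columns))), []
```

Each of the bullet points above would then be its own ClusteringStrategy subclass, selected at Synthesizer construction time.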

Hitting a problem with bloom filter size

This error happens when running with an AID column. The specific case here is the taxi table; the AID column is hack. I ran with 10000 rows. I tried several different dataframes: two had one column (med and passenger_count), and one had two columns (start and end datetime). All failed like this.

Fitting the synthesizer over the data...
c:\paul\GitHub\syndiffix-py\syndiffix\forest.py:62: FutureWarning: DataFrame.applymap has been deprecated. Use DataFrame.map instead.
  self.aid_data: npt.NDArray[np.uint64] = aids.applymap(hash_aid).to_numpy(Hash)
Traceback (most recent call last):
  File "C:\paul\GitHub\misc\python\syndiffix-py-play\testSynDiffixPy.py", line 120, in <module>
    tsd = testSynDiffixPy(df, csvFile, ['med','hack'], output_dir, aidsColumns=aidsColumns)
  File "C:\paul\GitHub\misc\python\syndiffix-py-play\testSynDiffixPy.py", line 55, in __init__
    synthesizer = Synthesizer(self.dfOrig, aids=aids)
  File "c:\paul\GitHub\syndiffix-py\syndiffix\synthesizer.py", line 46, in __init__
    self.forest = Forest(
  File "c:\paul\GitHub\syndiffix-py\syndiffix\forest.py", line 75, in __init__
    tree = self._build_tree(combination).push_down_1dim_root()
  File "c:\paul\GitHub\syndiffix-py\syndiffix\forest.py", line 120, in _build_tree
    tree = tree.add_row(0, RowId(index))
  File "c:\paul\GitHub\syndiffix-py\syndiffix\tree.py", line 221, in add_row
    self._create_child_leaf(child_index, row) if child is None else child.add_row(depth + 1, row)
  File "c:\paul\GitHub\syndiffix-py\syndiffix\tree.py", line 223, in add_row
    self.update_aids(row)
  File "c:\paul\GitHub\syndiffix-py\syndiffix\tree.py", line 50, in update_aids
    self.entity_counter.add(self.context.aid_data[row])
  File "c:\paul\GitHub\syndiffix-py\syndiffix\counters.py", line 43, in add
    self.aid_sets[i].add(aid)
  File "C:\Users\local_francis\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pybloom_live\pybloom.py", line 141, in add
    raise IndexError("BloomFilter is at capacity")
IndexError: BloomFilter is at capacity
PS C:\paul\GitHub\misc\python\syndiffix-py-play>

Make shortcut syntax for target column

Make a shortcut syntax like this:

df_synth = Synthesizer(df_orig, target_column='col1')

to accomplish this:

df_synth = Synthesizer(df_orig, clustering=MLClustering(target_column='col1'))

Dependence measure

There should be an existing library that supports a dependence measure.
We might be able to use scikit-learn's f-tests.
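scikit-learn's f_classif / f_regression or mutual_info_* selectors are candidates. As a stdlib-only illustration of the idea (not the scikit-learn code itself), mutual information between two discrete columns can serve as a simple dependence score:

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (bits) between two discrete columns."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )
```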

Metadata about original columns

I need to know whether a column is integral or real; the forest carries no such data. I added orig_data, but that is still converted to floats.

I might also need this for ML encoding of text columns with random shuffling.

Implement value-to-float conversions.

We need to provide various transformers for converting data values to floats.

For example, in order to:

  • convert datetimes to Unix timestamps and back
  • encode datetimes as year, month, day, etc.
  • encode strings as integers (in random or sorted order)
  • handle NULL values (drop rows, replace with average, replace with 2 * max, etc.)
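A sketch of the first transformer on the list, converting datetimes to Unix timestamps and back; the class name and interface are illustrative, not the actual syndiffix convertors:

```python
from datetime import datetime, timezone

class TimestampConvertor:
    """Illustrative transformer: naive datetime <-> Unix timestamp (float)."""

    def to_float(self, dt):
        # Treat naive datetimes as UTC for a stable round-trip.
        return dt.replace(tzinfo=timezone.utc).timestamp()

    def from_float(self, value):
        return datetime.fromtimestamp(value, tz=timezone.utc).replace(tzinfo=None)
```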

Initial cluster sometimes only has a single column.

Here are some examples:

Loading data from `..\test\intrusion.csv`...
Loaded 494021 rows. Columns:
0: Unnamed: 0 (int64)
1: duration (int64)
2: protocol_type (object)
3: service (object)
4: flag (object)
5: src_bytes (int64)
6: dst_bytes (int64)
7: land (int64)
8: wrong_fragment (int64)
9: urgent (int64)
10: hot (int64)
11: num_failed_logins (int64)
12: logged_in (int64)
13: num_compromised (int64)
14: root_shell (int64)
15: su_attempted (int64)
16: num_root (int64)
17: num_file_creations (int64)
18: num_shells (int64)
19: num_access_files (int64)
20: is_host_login (int64)
21: is_guest_login (int64)
22: count (int64)
23: srv_count (int64)
24: serror_rate (int64)
25: srv_serror_rate (int64)
26: rerror_rate (int64)
27: srv_rerror_rate (int64)
28: same_srv_rate (int64)
29: diff_srv_rate (int64)
30: srv_diff_host_rate (int64)
31: dst_host_count (int64)
32: dst_host_srv_count (int64)
33: dst_host_same_srv_rate (int64)
34: dst_host_diff_srv_rate (int64)
35: dst_host_same_src_port_rate (int64)
36: dst_host_srv_diff_host_rate (int64)
37: dst_host_serror_rate (int64)
38: dst_host_srv_serror_rate (int64)
39: dst_host_rerror_rate (int64)
40: dst_host_srv_rerror_rate (int64)
41: label (object)

Fitting the synthesizer over the data...
Column clusters:
Initial= [9]
Derived= (SHARED, [9], [33, 4, 24, 26, 28])
Derived= (SHARED, [4], [7])
Derived= (SHARED, [4], [19])
Derived= (SHARED, [4], [15])
Derived= (SHARED, [4], [34])
Derived= (SHARED, [4], [17])
Derived= (SHARED, [33, 4, 28], [32, 1, 35, 38])
Derived= (SHARED, [4], [10])
Derived= (SHARED, [4], [20])
Derived= (SHARED, [32, 28, 4], [40, 2, 27, 31])
Derived= (SHARED, [4], [14])
Derived= (SHARED, [4], [36])
Derived= (SHARED, [1, 2, 35], [41, 3, 12, 6])
Derived= (SHARED, [4], [16])
Derived= (SHARED, [4], [11])
Derived= (SHARED, [26, 27, 4], [39])
Derived= (SHARED, [4], [29])
Derived= (SHARED, [33, 35, 28], [5, 22, 23])
Derived= (SHARED, [4], [21])
Derived= (SHARED, [22], [30])
Derived= (SHARED, [4], [8])
Derived= (SHARED, [32, 22], [0])
Derived= (SHARED, [24, 4, 38], [25, 37])
Derived= (SHARED, [4], [18])
Derived= (SHARED, [4], [13])
Loading data from `..\test\insurance.csv`...
Loaded 14000 rows. Columns:
0: GoodStudent (bool)
1: Age (object)
2: SocioEcon (object)
3: RiskAversion (object)
4: VehicleYear (object)
5: ThisCarDam (object)
6: RuggedAuto (object)
7: Accident (object)
8: MakeModel (object)
9: DrivQuality (object)
10: Mileage (object)
11: Antilock (bool)
12: DrivingSkill (object)
13: SeniorTrain (bool)
14: ThisCarCost (object)
15: Theft (bool)
16: CarValue (object)
17: HomeBase (object)
18: AntiTheft (bool)
19: PropCost (object)
20: OtherCarCost (object)
21: OtherCar (bool)
22: MedCost (object)
23: Cushioning (object)
24: Airbag (bool)
25: ILiCost (object)
26: DrivHist (object)

Fitting the synthesizer over the data...
Column clusters:
Initial= [15]
Derived= (SHARED, [15], [4, 14, 19, 20, 24])
Derived= (SHARED, [24, 4, 14], [16, 10])
Derived= (SHARED, [19, 20, 14], [9, 26, 5, 7])
Derived= (SHARED, [24, 16, 4], [11, 17, 18, 2])
Derived= (SHARED, [19, 14], [25])
Derived= (SHARED, [9, 18, 26], [1, 3, 13])
Derived= (SHARED, [1], [0])
Derived= (SHARED, [24, 17, 2], [8, 6, 23])
Derived= (SHARED, [9, 26, 7], [12])
Derived= (SHARED, [17, 2, 4], [21])
Derived= (SHARED, [19, 20, 14], [22])

Top level interface discussion

Please share how you envision initiating synthesis from the top level.

What we need to be configurable:

  • Parameters
  • What columns to select
  • Column encoding
  • Clustering (univariate/focus/ML-k)
  • (Future) Number of rows
  • Other customizable behaviors

Excessive suppression

When running slurm, I notice a lot of files are almost completely suppressed in some columns. Example here is insurance.csv:

GoodStudent,Age,SocioEcon,RiskAversion,VehicleYear,ThisCarDam,RuggedAuto,Accident,MakeModel,DrivQuality,Mileage,Antilock,DrivingSkill,SeniorTrain,ThisCarCost,Theft,CarValue,HomeBase,AntiTheft,PropCost,OtherCarCost,OtherCar,MedCost,Cushioning,Airbag,ILiCost,DrivHist
False,*0,*0,*0,Current,*3,*1,*3,*1,*1,*2,False,*1,False,T*2,False,*3,*1,True,*1,T*3,True,*0,*0,True,T*3,*0
False,*0,*0,*0,Current,*0,*1,*2,*0,Poor,*0,False,*0,False,HundredThou,False,TwentyThou,*1,False,*0,T*3,False,*0,*0,True,T*2,*0
False,*0,*0,*0,Older,*2,Tank,*2,*0,*1,*2,False,*0,False,T*3,False,*2,*1,False,*0,T*3,False,*0,*0,True,T*3,*0
False,*0,*0,*0,Older,*2,*1,*2,*0,Poor,*1,False,SubStandard,False,HundredThou,False,TwentyThou,S*2,True,*1,T*2,True,*0,*0,False,T*2,*0
False,*0,*0,*0,Current,*0,*1,Severe,*1,*0,*0,False,*0,False,T*3,False,*2,*0,False,*0,T*3,False,*0,*0,True,T*2,*0
False,*0,*0,*0,Current,*0,*0,*2,*0,Poor,*0,False,SubStandard,False,HundredThou,False,TwentyThou,S*3,False,*0,HundredThou,False,*0,*0,True,T*2,*0
False,*0,*0,*0,Current,*1,*0,*0,*0,Poor,*0,False,SubStandard,False,T*3,False,TwentyThou,*0,True,*0,T*2,False,*0,*0,True,T*2,*0
False,*0,*0,*0,Current,None,*0,None,*0,Poor,*0,False,SubStandard,False,HundredThou,False,TwentyThou,S*2,True,*0,T*2,True,*0,*0,True,T*2,*0
False,*0,*0,*0,Current,*2,*1,*2,*1,Poor,*1,False,SubStandard,False,T*2,False,*1,*1,False,*1,T*2,False,*0,*0,False,T*2,*0
False,*0,*0,*0,Current,*0,*0,*1,*0,Poor,*0,False,SubStandard,False,HundredThou,False,TwentyThou,*0,False,*1,T*3,True,*0,*0,True,T*2,*0
False,*0,*0,*0,Current,*2,Tank,Severe,*2,*0,*2,False,*0,False,Thousand,False,*1,*0,False,*1,HundredThou,True,*0,*0,True,T*3,*0
False,*0,*0,*0,Current,*2,*1,*2,*0,Poor,*0,False,SubStandard,False,HundredThou,False,TwentyThou,*1,False,*0,T*2,False,*0,*0,True,T*2,*0
...

We don't detect datetimes properly

The file is expedia_hotel_logs, column date_time. I saw this result:

         log_id      date_time  ...  hotel_market  hotel_cluster
0  LOG_0076*532    2014-01-*86  ...            29             25
1    LOG_000*16     2014-0*393  ...           366             22
2   LOG_008*587  2014-08-1*424  ...           191             25
3   LOG_007*525  2014-07-1*265  ...           633             70
4   LOG_007*499  2014-07-1*291  ...            24             15

The suppression pattern indicates it is being parsed as a string.
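A sketch of stricter detection on the loading side: try to parse the values with known datetime formats before falling back to string handling. The formats and threshold below are illustrative:

```python
from datetime import datetime

_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d")

def looks_like_datetime(values, threshold=0.9):
    """True if at least `threshold` of the values parse as datetimes."""
    def _try(v, f):
        try:
            datetime.strptime(v, f)
            return True
        except ValueError:
            return False

    hits = sum(any(_try(v, f) for f in _FORMATS) for v in values)
    return hits >= threshold * len(values)
```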

Support more Python versions.

Because of scipy, we are limited to supporting Python versions >= 3.10 and < 3.13.
We should fix dependencies to allow any version of Python >= 3.10.
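Assuming Poetry-style dependency tables, the constraint could look roughly like this (version numbers are illustrative and would need checking against scipy's actual support matrix):

```toml
[tool.poetry.dependencies]
python = ">=3.10"
scipy = [
    { version = "^1.11", python = ">=3.10,<3.13" },
    { version = ">=1.14.1", python = ">=3.13" },
]
```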

Add scaffolding for Jekyll

Not sure what all should be here, but at least a simple home page with a few tabs like "home", "contact", and "download".

list index out of range when building microdata

Model sdx_py
Using source file census.csv
Model sdx_py for dataset /INS/syndiffix/work/edon/results/sdx_py/csv/train/census.csv, focus column detailed household summary in household
Training dataframe shape (before features) (209499, 41)
['age', 'class of worker', 'detailed industry recode', 'detailed occupation recode', 'education', 'wage per hour', 'enroll in edu inst last wk', 'marital stat', 'major industry code', 'major occupation code', 'race', 'hispanic origin', 'sex', 'member of a labor union', 'reason for unemployment', 'full or part time employment stat', 'capital gains', 'capital losses', 'dividends from stocks', 'tax filer stat', 'region of previous residence', 'state of previous residence', 'detailed household and family stat', 'detailed household summary in household', 'migration code-change in msa', 'migration code-change in reg', 'migration code-move within reg', 'live in this house 1 year ago', 'migration prev res in sunbelt', 'num persons worked for employer', 'family members under 18', 'country of birth father', 'country of birth mother', 'country of birth self', 'citizenship', 'own business or self employed', "fill inc questionnaire for veteran's admin", 'veterans benefits', 'weeks worked in year', 'year', 'label']
Columns ['age', 'class of worker', 'detailed industry recode', 'detailed occupation recode', 'education', 'wage per hour', 'enroll in edu inst last wk', 'marital stat', 'major industry code', 'major occupation code', 'race', 'hispanic origin', 'sex', 'member of a labor union', 'reason for unemployment', 'full or part time employment stat', 'capital gains', 'capital losses', 'dividends from stocks', 'tax filer stat', 'region of previous residence', 'state of previous residence', 'detailed household and family stat', 'detailed household summary in household', 'migration code-change in msa', 'migration code-change in reg', 'migration code-move within reg', 'live in this house 1 year ago', 'migration prev res in sunbelt', 'num persons worked for employer', 'family members under 18', 'country of birth father', 'country of birth mother', 'country of birth self', 'citizenship', 'own business or self employed', "fill inc questionnaire for veteran's admin", 'veterans benefits', 'weeks worked in year', 'year', 'label']
Running with ML target detailed household summary in household...
Column clusters:
Initial= [6, 7, 9, 10, 11, 12, 23]
Derived= (SHARED, [23], [13, 14, 15, 19, 20, 24])
Derived= (SHARED, [23], [25, 26, 27, 28, 29])
Derived= (SHARED, [23], [0, 37, 39, 40, 30])
Derived= (SHARED, [23], [2, 3, 4, 5])
Derived= (SHARED, [23], [8, 16, 17, 18, 21])
Derived= (SHARED, [23], [32, 33, 38, 22, 31])
Derived= (SHARED, [], [1])
Derived= (SHARED, [], [34])
Derived= (SHARED, [], [35])
Derived= (SHARED, [], [36])
Traceback (most recent call last):
  File "/INS/syndiffix/work/edon/test-syndiffix/test_syndiffix/oneModel.py", line 598, in <module>
    fire.Fire(oneModel)
  File "/home/egashi/.asdf/installs/python/3.10.13/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/egashi/.asdf/installs/python/3.10.13/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/egashi/.asdf/installs/python/3.10.13/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/INS/syndiffix/work/edon/test-syndiffix/test_syndiffix/oneModel.py", line 530, in oneModel
    runSynDiffix(df, outPath, focusColumn, doPatches, testData, job)
  File "/INS/syndiffix/work/edon/test-syndiffix/test_syndiffix/oneModel.py", line 136, in runSynDiffix
    synData = synthesizer.sample()
  File "/INS/syndiffix/work/edon/syndiffix-py/syndiffix/synthesizer.py", line 70, in sample
    rows, root_combination = build_table(
  File "/INS/syndiffix/work/edon/syndiffix-py/syndiffix/clustering/stitching.py", line 377, in build_table
    acc = materialize_tree(forest, clusters.initial_cluster)
  File "/INS/syndiffix/work/edon/syndiffix-py/syndiffix/synthesizer.py", line 62, in materialize_tree
    generate_microdata(
  File "/INS/syndiffix/work/edon/syndiffix-py/syndiffix/microdata.py", line 198, in generate_microdata
    microdata_rows.extend(
  File "/INS/syndiffix/work/edon/syndiffix-py/syndiffix/microdata.py", line 148, in _microdata_row_generator
    yield [_generate(i, c, nm) for i, c, nm in zip(intervals, convertors, null_mappings)]
  File "/INS/syndiffix/work/edon/syndiffix-py/syndiffix/microdata.py", line 148, in <listcomp>
    yield [_generate(i, c, nm) for i, c, nm in zip(intervals, convertors, null_mappings)]
  File "/INS/syndiffix/work/edon/syndiffix-py/syndiffix/microdata.py", line 139, in _generate
    return convertor.from_interval(interval) if interval.min != null_mapping else (None, null_mapping)
  File "/INS/syndiffix/work/edon/syndiffix-py/syndiffix/microdata.py", line 122, in from_interval
    return self._map_interval(interval)
  File "/INS/syndiffix/work/edon/syndiffix-py/syndiffix/microdata.py", line 127, in _map_interval
    min_value = self.value_map[int(interval.min)]
IndexError: list index out of range

I think it somehow ended up with an empty set of strings.

Object of type Timestamp is not JSON serializable

This is more of a test-syndiffix issue. It looks like datasets with datetimes fail to produce output.

Traceback (most recent call last):
  File "/INS/syndiffix/work/edon/test-syndiffix/test_syndiffix/oneModel.py", line 599, in <module>
    fire.Fire(oneModel)
  File "/home/egashi/.asdf/installs/python/3.10.13/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/egashi/.asdf/installs/python/3.10.13/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/egashi/.asdf/installs/python/3.10.13/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/INS/syndiffix/work/edon/test-syndiffix/test_syndiffix/oneModel.py", line 531, in oneModel
    runSynDiffix(df, outPath, focusColumn, doPatches, testData, job)
  File "/INS/syndiffix/work/edon/test-syndiffix/test_syndiffix/oneModel.py", line 155, in runSynDiffix
    json.dump(outJson, f, indent=4)
  File "/home/egashi/.asdf/installs/python/3.10.13/lib/python3.10/json/__init__.py", line 179, in dump
    for chunk in iterable:
  File "/home/egashi/.asdf/installs/python/3.10.13/lib/python3.10/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/home/egashi/.asdf/installs/python/3.10.13/lib/python3.10/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/home/egashi/.asdf/installs/python/3.10.13/lib/python3.10/json/encoder.py", line 325, in _iterencode_list
    yield from chunks
  File "/home/egashi/.asdf/installs/python/3.10.13/lib/python3.10/json/encoder.py", line 325, in _iterencode_list
    yield from chunks
  File "/home/egashi/.asdf/installs/python/3.10.13/lib/python3.10/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/home/egashi/.asdf/installs/python/3.10.13/lib/python3.10/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type Timestamp is not JSON serializable
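One straightforward fix on the test-syndiffix side is to pass a `default` hook to json.dump; pandas Timestamp subclasses datetime, so an isinstance check on datetime covers it (a sketch, not the actual test-syndiffix code):

```python
import json
from datetime import datetime

def _json_default(o):
    # Render datetimes (including pandas Timestamps) as ISO-8601 strings.
    if isinstance(o, datetime):
        return o.isoformat()
    raise TypeError(f"Object of type {o.__class__.__name__} "
                    f"is not JSON serializable")

out = json.dumps({"t": datetime(2014, 1, 1, 12, 0)}, default=_json_default)
```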

Testing checklist for syndiffix compliance

  • Forest tests
  • Dependence measure tests
  • Entropy measure tests
  • Stitching tests
  • Harvesting tests
  • Microdata tests

Stitching and harvesting rely on random state, and we use different RNG algorithms across implementations, so these will be trickier to test...
