amazon-archives / amazon-dsstne Goto Github PK

Deep Scalable Sparse Tensor Network Engine (DSSTNE) is an Amazon developed library for building Deep Learning (DL) machine learning (ML) models

License: Apache License 2.0

C 2.99% C++ 48.00% Cuda 41.05% Python 0.88% CMake 0.18% Makefile 0.80% Dockerfile 0.07% Java 6.04%

amazon-dsstne's Introduction

Amazon DSSTNE: Deep Scalable Sparse Tensor Network Engine

DSSTNE (pronounced "Destiny") is an open source software library for training and deploying recommendation models with sparse inputs, fully connected hidden layers, and sparse outputs. Models with weight matrices that are too large for a single GPU can still be trained on a single host. DSSTNE has been used at Amazon to generate personalized product recommendations for our customers at Amazon's scale. It is designed for production deployment of real-world applications which need to emphasize speed and scale over experimental flexibility.

DSSTNE was built with a number of features for production recommendation workloads:

Multi-GPU Scale: Training and prediction both scale out to use multiple GPUs, spreading out computation and storage in a model-parallel fashion for each layer.
Large Layers: Model-parallel scaling enables larger networks than are possible with a single GPU.
Sparse Data: DSSTNE is optimized for fast performance on sparse datasets, common in recommendation problems. Custom GPU kernels perform sparse computation on the GPU, without filling in lots of zeroes.

Benchmarks

scottlegrand@ reported [near-linear scaling with multiple GPUs] on the MovieLens recommendation problem (https://medium.com/@scottlegrand/first-dsstne-benchmarks-tldr-almost-15x-faster-than-tensorflow-393dbeb80c0f#.ghe74fu1q)
Directions on how to run a benchmark can be found in here

Scaling up

Using Spark in AWS EMR and Dockers in AWS ECS

License

Setup

Follow Setup for step by step instructions on installing and setting up DSSTNE

User Guide

Check User Guide for detailed information about the features in DSSTNE

Examples

Check Examples to start trying your first Neural Network Modeling using DSSTNE

Q&A

FAQ

amazon-dsstne's People

Contributors

Stargazers

Watchers

Forkers

phvu jayantsahewal vijaimatamazon rtvt123 onlydole brandturner codeaudit franciscogodoy jorellano kp666 machinelearningjourney deepesch vishwakarmarahul kevinmel2000 kahowu hak23 traveler817 luckytina haoshuji wanjinchang grail lixiaosi33 geilove hitluobin nolaan snazz2001 fairymane nimmen helma-t skypea paulhendricks adomore chongfeng shiranian neoneye xhuvom famousdeadguy tribemedia tomasantony sysman-one tspannhw tfwu kimfriishansen michaelhappycheng sarvagnya1 schevalier mantrasuser yhhuang-tw flowgrad dikku txd888 furyphoenix maramy jayinai tigerjim 2php xuan9719 goodsensejp zhang5555 zuiwanting lesaffrea jiajiadf fboldo devsunny 781155640 sudanenator yutarochan cxysteven nikoma owenb132 wgapl rlugojr wuntoguo szhaoyu andy-amoy shubhamtotala olachan tisma tkaneda badlogicmanpreet chagge fatty- xzm2004260 learn-deeplearn msoftware h4rikris ikrishneel saiaman aozturk latishchalla arnabghosh albert-lzg merritos toiyeuvietnam1986 bcmundim yk-nalabo sunilkgrao gopijey seanlhlee selimam

amazon-dsstne's Issues

config in benchmark example

"Layers" : [
{ "Name" : "Input", "Kind" : "Input", "N" : "auto", "DataSet" : "gl_input", "Sparse" : true },
{ "Name" : "Hidden1", "Kind" : "Hidden", "Type" : "FullyConnected", "Source" : "Input", "N" : 1024, "Activation" : "Sigmoid", "Sparse" : false, "pDropout" : 0.5, "WeightInit" : { "Scheme" : "Gaussian", "Scale" : 0.01 } },
{ "Name" : "Hidden2", "Kind" : "Hidden", "Type" : "FullyConnected", "Source" : ["Hidden1"], "N" : 1024, "Activation" : "Sigmoid", "Sparse" : false, "pDropout" : 0.5, "WeightInit" : { "Scheme" : "Gaussian", "Scale" : 0.01 } },
{ "Name" : "Hidden3", "Kind" : "Hidden", "Type" : "FullyConnected", "Source" : ["Hidden2"], "N" : 1024, "Activation" : "Sigmoid", "Sparse" : false, "pDropout" : 0.5, "WeightInit" : { "Scheme" : "Gaussian", "Scale" : 0.01 } },
{ "Name" : "Output", "Kind" : "Output", "Type" : "FullyConnected", "DataSet" : "gl_output", "N" : "auto", "Activation" : "Sigmoid", "Sparse" : true , "WeightInit" : { "Scheme" : "Gaussian", "Scale" : 0.01, "Bias" : -10.2 }}
],

There are five layers totally. The first layer Sparse is true, and the output layer Sparse is true.
I am wondering why the output data is sparse.
For the each userid, I guess the output is vector(27278 dimensions), of the score on each movie and the score is float.
So, it seems the output data is dense if I get scores of all movies.
Could you give me some advice on this?
Thank you!

Any support for reservoir computing algorithms?

I understand that DSSTNE does support sparsely connected networks but I really want to use this framework for implementing reservoir computing where there is randomized connections among the nodes in the hidden layers and connections to input and output layers.
Thanks in advance.

mpiCC: No such file or directory?

I get this error when i attemp to build the code


************  RELEASE mode ************
mpiCC -DOMPI_SKIP_MPICXX -std=c++0x -O3 -I/usr/local/cuda/include -IB40C -IB40C/KernelCommon -I/usr/local/include -I/usr/local/openmpi/include -I/usr/include/jsoncpp -I../utils -I../engine -c NNTypes.cpp
make[1]: mpiCC: No such file or directory
make[1]: *** [NNTypes.o] Error 1
make: *** [lib/libdsstne.a] Error 2

Any ideas?

Issue when follow the example

Hello all,

I am following the example with movielens and got the error

generateNetCDF -d gl_input -i ml20m-all -o gl_input.nc -f features_input -s samples_input -c
generateNetCDF: error while loading shared libraries: libnetcdf_c++4.so.1: cannot open shared object file: No such file or directory

I installed all the libraries required in the homepage (I am running Ubuntu 14.04 with GPU CUDA). Could you suggest some hints to figure out what did I do wrong?

Thanks a lot,

AWS Machine Learning and DSSTNE

First of all, apologies if this is the wrong place to post this question. I would appreciate it if you could point me to the correct forum.

I am new to ML and NNs and i'm currently researching what's available out there so please excuse my ignorance. I got a couple of questions that i would like to ask the community.

I was wondering what the difference is between AWS Machine Learning (https://aws.amazon.com/machine-learning/) and DSSTNE. Does AWS ML use DSSTNE in the background? Are they completely different technologies?

Google's Parsey McParseface uses Tensorflow behind the scenes. Is DSSTNE what Amazon's Alexa uses?

Theoretically, can i integrate ParseyMcParseface/Alexa(AVS) with DSSTNE?

Thank you very much

Math behind dsstne

Hi guys,

Thank you for releasing your project to open-source.
Can you please explain a math behing dsstne?

Is it a collaborative filtering with explicit/implicit feedback?
You don't use ratings from MovieLens, so I assume that implict...
A link to a paper/doc will be very userfull.

gtx1080 kScaleAndBias_kernel error

Dear authors,

When I try to run the sample using GTX 980 card, it works properly. However, when I specified the program to GTX 1080 card on the same machine, an error was occurred as follows:

NNNetwork::NNNetwork: 1 input layer
NNNetwork::NNNetwork: 1 output layer
NNWeight::NNWeight: Allocating 115343360 bytes (2048, 14080) for weights between Input and Hidden1
Error: invalid device function launching kernel kScaleAndBias_kernel
GpuContext::Shutdown: Shutting down cuBLAS on GPU for process 0
GpuContext::Shutdown: CuBLAS shut down on GPU for process 0
GpuContext::Shutdown: Shutting down cuRand on GPU for process 0
GpuContext::Shutdown: CuRand shut down on GPU for process 0
GpuContext::Shutdown: Process 0 out of 1 finalized.

Could you help to find what it is the problem? Thank you!

Best regards,
Shaohuai

Non-empty .bash_history present on AMI

The AMI for Amazon DSSTNE (ami-d6f2e6bc) includes a non-empty .bash_history file.

As is turns out, there's nothing sensitive in it, but this is in contradiction with best practices :) http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/building-shared-amis.html

Correctly handle analog and digital inputs in utility/wrapper application

DSSTNE's utility/wrapper do not currently distinguish between analog and digital predictors, and the current behavior is poorly documented. An example of confusion this causes can be seen in issue #66.

Improved documentation will definitely help here!

I've created this issue to identify the work items required to implement support for analog and digital input across the 'generateNetCDF', 'predict' and 'train' utilities. And where it makes sense to do so, we can update some of the documentation while investigating.

Comand Definition Language

Hello,

we are trying to test DSSTNE on our data and were experimenting with various NN settings.
According to your CDL file,
https://github.com/amznlabs/amazon-dsstne/blob/master/docs/getting_started/CDL.txt
it should be possible to change optimization algorithm or to set other training parameters. We added the configuration as part of NN json configuration file but it did not work.

We were getting error "LoadNeuralNetworkJSON: Unknown neural network field: trainingparameters"

Basically we just added block like this to NN json config:
"TrainingParameters": {
"epochs": 40,
"alpha": 0.01,
"optimizer": "Nesterov",
"lambda": 0.001
}

What is the right way to set up training parameters such as optimizer?

Thanks,

Michal

Trouble with AWS GPU instance and Docker

Hello,

I made an AWS GPU instance from the ami called "amazon-dsstne" (ami-7a0df81a) and made docker image of dsstne there. I followed the example in the document. Then when running "train", it fails with the following error message:

modprobe: ERROR: ../libkmod/libkmod.c:556 kmod_search_moddep() could not open moddep file '/lib/modules/3.13.0-83-generic/modules.dep.bin'
cudaGetDeviceCount failed unknown error

Thanks.

Fail to build on the AMI.

After this commit, I can't build DSSTNE on the AMI. Or setup instruction doesn't work well.

The following is the error message.
'GpuTypes.h:23:19: fatal error: cudnn.h: No such file or directory'

The sample doesn't seem to trigger the sparse path

It looks the default ml20-all sample doesn't call sparse kernels. All I see is cublasSgemm.
Log says "NNDataSet::CalculateSparseDatapointCounts: Maximum sparse datapoints (9254) per example in dataset gl_input too large for fast sparse kernels."

Does that mean, given the sparsity of this case, dense sgemm still out perform the sparse ones?

Correctly handle or reject duplicate feature index IDs in 'generateNetCDF' and 'predict' utilities

When parsing a feature/sample index files, the 'generateNetCDF' and 'predict' utilities do not detect duplicate IDs. Failing to detect duplicate IDs allows a invalid index files to be loaded, which can eventually lead to issues such as segmentation faults (see issue #62 for an example).

Although this is related to issue #64 which seeks to resolve the issue at the time that the data is generated, fixing the issue here may prevent it from coming up in different ways.

Low ranking accuracy of the example with MovieLens20M?

Hi,

I've been playing around today with DSSTNE with the goal of running the example with MovieLens20M and compare the NN in the example with some state-of-the-art CF algorithms that I have implemented here. From my evaluation (which is by no means exhaustive or perfect) the example provided by DSSTNE does not seem to be competitive with respect to state of the art CF algorithms.

To summarise, I have downloaded the original MovieLens 20M dataset and I have performed a random 80%-20% partition. I have transformed the training subset to the DSSTNE format, with the only difference that I do not include the timestamps of the dataset, but 1's for all movies (is this actually very important??). I have generated recommendations with my CF algorithms (popularity, user-based and matrix factorisation) and, following the steps in the example, the predictions of DSSTNE. Finally, I have evaluated the performance with the testing subset using precision at cutoff 10.

These are the results, the configuration provided in your example does not seem to work very well:
pop 0.10974162112149495
ub 0.24097987334078072
mf 0.25135912784469483
dsstne 0.056956854920365056

I am no expert in ANN's so I cannot figure out easily whether I should modify the parameters in the config.json provided in the example to make it work better. Have you compared the performance of the example with similar CF algorithms? If so, could you please share some results/insights?

Cheers
Saúl

how predict works

110510 26743
121019 26740
121017 26739
106401 26736
104307 26734
103010 26733
71300 26732
127445 26730
120839 26729
123188 26725

this is the features_input
i guess the first col is movie id, the second col is index of features.

for ml-20m dataset, there are 138493 users, which means the size of input data is 138493
what is the feature of each input? could you give me an example?
i guess it's a vector, dimension is 26744; each dimension stands for a movie id
if user like this movie, the value will be 1; otherwise 0

the network has 3 layers
the input layer size N is auto, so the N should be 26744?
the hidden layer size N is 128, the output layer size N is auto(26744).

and if the network is trained, how predict works?

-l layer: (default = Output) the network layer to use for predictions

and i make two experiments,

the first do not use -l
predict -b 1024 -d gl -i features_input -o features_output -k 10 -n gl.nc -f ml20m-all -s recs -r ml20m-all

second: use -l Hidden
predict -b 1024 -d gl -i features_input -o features_output -l Hidden -k 10 -n gl.nc -f ml20m-all -s recs -r ml20m-all

the prediction result is the same, it seems "-l" does not work.
And i check source code, in /src/amazon/utils/Predict.cpp, I could not find getOptionalArgValue of "-l"

and if i set -k to 1000, it will reminds
Error :Optimized topk Only works for top 128

what predict really take for prediction, output layer's output(26744dimension)?
and each value is a float between 0-1
am i right? Could you please give some papers about how this example works?

Thank you

GPU memory consumption for large example sets

I have tried to train an autoencoder similar to the movielens example with a couple of million sparse training examples and ran into GPU memory allocation errors. After reducing the number of examples the network could be trained just fine. After taking a look at the code, it seems that the whole training data is saved on the GPU for the input and output layers (this seems to happen in the NNDataSet::Shard function).
Is there any way to get around this limitation? In theory, shouldn't it be enough to upload only the data for the current mini-batch?

Dataset Question

Inside the ml20m-all file, it looks like this

2,1112486027:29,1112484676:32,1112484819:47,1112484727:50,1112484580:112,1094785740:151,1094785734:223,1112485573:253,1112484940:260,1112484826:293,1112484703:296,1112484767:318,1112484798:337,1094785709:367,1112485980:541,1112484603:589,1112485557:593,1112484661:653,1094785691:919,1094785621:924,

This looks like a listing of movies. Each movie has a list of features. But what are the features? This is only raw values, i'd like to know what these values mean.

Also, is the example neural network doing collaborative filtering or content based filtering?

'train' and 'predict' commands?

I successfully spun up a GPU instance on EC2 using AMI you guys mentioned and uploaded this repo to it. I tried running the sample via run_movie_lens_sample.sh but it can't find the 'train' or 'predict' commands. How do i install these commands?

Remove unnecessary null pointer checks

Extra null pointer checks are not needed in the function "NNLayer::Deallocate".

documentation clarification

Your docs say
Hidden Hidden Layers are Layers which connect between layers. It Does require a DataSet but rather a Source. If Source is not mentioned then the previous Layer is taken as Source

I think it might be changed to
Hidden Hidden Layers are Layers which connect between layers. It does not require a DataSet but rather a Source. If Source is not mentioned then the previous Layer is taken as Source

Train problem in AWS AMI

Hello all,
I am using the AWS instance for the setup, i have a problem when running the training part:

train -c config.json -i gl_input.nc -o gl_output.nc -n gl.nc -b 256 -e 10
Train will use configuration file: config.json
Train will use input data file: gl_input.nc
Train will use output data file: gl_output.nc
Train will produce networkFileName: gl.nc
Train will use batchSize: 256
Train will use number of epochs: 10
Train alpha 0.025, lambda 0.0001, mu 0.5.Please check CDL.txt for meanings
GpuContext::Startup: Process 0 out of 1 initialized.
modprobe: ERROR: could not insert 'nvidia': No such device
cudaGetDeviceCount failed no CUDA-capable device is detected

However, there are several versions of nvidia ...what can be the issue?
Thanks in advance!

collaborative filtering with neural networks? how is this implemented.

Hi fellows,

Thanks for the great work. I successfully applied this tool to a similar product recommendation task.

As for the example given, I am trying to understand how the three-layer neural network is used to tackle the MovieLens recommendation problem, which is a collaborative filtering task where amount of time users spend watching each movie is taken as an implicit rating. A common way of addressing this problem is matrix factorization to fill the blanks.

I have used neural networks for regression/classification tasks. However, it is not clear to me how movieLens problem can be addressed with neural networks. Is this based on some approach proposed in literature?

Another question is if dsstne can be applied for solving more generic ML tasks, such as a regression problem. Based on the example given, it seems like the existing predict API is only suitable for collaborative filtering type problems.

cheers,

How to improve the ranking rate?

Hi All,

I trained the network using blow data,

11032 4153892,65002142:3462821,63988456:2213877,64982398:2213877,65009021:2213877,65272075:4040029,65002223:...
Note: 11032 is user_id, 4153892 is sku_no, 65002142 is order_no

but got very low ranking rate, could you have any suggestions?

11032 4431085,0.014:3970704,0.014:4231712,0.013:3941914,0.013:4093510,0.012:4074152,0.012:4431424,0.012:4258254,0.012:3968476,0.011:3968475,0.011:

Thanks

Segmentation fault

Hi,

I completed the training then performed the predicting, but got below exception:
Do you have any suggestions?

BTW, just one neural network got this error, others is OK.

=========== exception messages =========
Exported gl_input_predict.samplesIndex with 65075 entries.
Raw max index is: 65064
Rounded up max index to: 65152
Created NetCDF file gl_input_predict.nc for dataset gl_input
Number of network input nodes: 65064
Number of entries to generate predictions for: 65075
LoadNetCDF: Loading UInt data set
NNDataSet::NNDataSet: Name of data set: gl_input
NNDataSet::NNDataSet: Attributes: Sparse Boolean
NNDataSet::NNDataSet: 1-dimensional data comprised of (65152, 1, 1) datapoints.
NNDataSet::NNDataSet: 3778407 total datapoints.
NNDataSet::NNDataSet: 65075 examples.
[snx-dsstne:02608] *** Process received signal ***
[snx-dsstne:02608] Signal: Segmentation fault (11)
[snx-dsstne:02608] Signal code: Address not mapped (1)
[snx-dsstne:02608] Failing at address: 0xb3a1840
[snx-dsstne:02608] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7f02cc834330]
[snx-dsstne:02608] [ 1] predict[0x430d26]
[snx-dsstne:02608] [ 2] predict[0x453fa0]
[snx-dsstne:02608] [ 3] predict[0x42a87b]
[snx-dsstne:02608] [ 4] predict[0x408307]
[snx-dsstne:02608] [ 5] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f02cc480f45]
[snx-dsstne:02608] [ 6] predict[0x40aab1]
[snx-dsstne:02608] *** End of error message ***
Segmentation fault (core dumped)

any "convolutional" support yet?

Hi,
I have a network with sparse 1d input vectors and want first layer to be convolutional. Can I do that with dsstne?
The api does mention "Convolutional" but the documentation mentions it's not supported for "image", while my input is not image.

Would you please clarify?
Thank you.
Ameen.

How to continue learning?

Hi,

I have used the 'train -n gl.nc ...' to get a network file. In the future, if I have more new sample data can be used to training. How to continue training? I used the command 'train -n gl.cn ..' again, but got below error:

Error: Network file already exists: gl.nc

Thanks.

Machine Learning

Currently we are looking for Recommendation Engine suitable for online marketing .We came across AWS Machine Learning where we have to submit our input file containing user item relation along with its targeted to generate a model .Mortar Recommendation Engine which is not currently available interacts with AWS and generate target value which can be passed to the Machine Learning Process and Prediction will be available ...In DSSTNE how is it working ..Im totally confused.

Typo

On the example description second line :
"will go through the 3 basics steps and walk you ~~though~~ through the wrappers"

Evaluate new topK function

The code promises to use SGD as the default training mode, but then doesn't.

To quote lines 138-140 in Train.cpp

// Set to default training mode SGD.
TrainingMode mode=Nesterov;
pNetwork->SetTrainingMode(mode);

Elsewhere in the help files it also says that the default is SGD. And after looking through the code, I can't figure out how the user can actually set the optimization mode. Before you reply "just put it in the .config file", that doesn't work - see #39

The default should either be changed to SGD, or the documentation should change. And it should be clearer how a user can specify the training optimization mode.

Catch exceptions in main()?

I expect that exception handling is usually supported by a C++ program. I wonder why your function "main" does not contain corresponding try and catch instructions so far.

How do you think about recommendations by Matthew Wilson in an article?

Would you like to adjust the implementation if you consider effects for uncaught/unhandled exceptions like they are described by Danny Kalev?

NetCDF not working in AWS AMI

Hi, I tried using DSSTNE by creating an instance with AMI Amazon DSSTNE (ami-d6f2e6bc). I am not able to run the command 'generateNetCDF -d gl_input -i ml20m-all -o gl_input.nc -f features_input -s samples_input -c' giving error 'generateNetCDF: command not found'. Do I need to install and configure NetCDF in the AMI before using it?

Thanks
-Hari

Expand to multiple instances?

Is there a way to run this on multiple instances working on the same data set? like EMR?

Questions about Dataset

Hi,
I was playing with the sample data, and now I have 3 questions.

Q1. How to make dataset with multiple feature values?
Currently, only one feature has one feature value. Is it possible to a feature has multiple values? If so how can I do that?

Q2. Changing all timestamps to 1 manually giving me a different result.
ml20m-all is the dataset of userId and movieId with timestamp.

userId movieId,timestamp: movieId,timestamp: movieId,timestamp…

On Issue#21, Mr.Rejith said “Currently no movie features are taken. Currently only 1/0 signals are supported from the wrapper script even though the Engine supports analog signals.”
So I changed all timestamps in ml20m-all to 1, and ran DSSTNE with modified data.
eg) 2,1112486027:29,1112484676:32,1112484819 to 2,1:29,1:32,1
I thought results would be the same, but it was not.
I am guessing that DSSTNE treats feature value as continuous value. Is this right? Then why did DSSTNE give me a different result?

Q3. Does DSSTNE support digital inputs?
On Issue#11, Mr.Rejith said “DSSTNE Engine supports analog inputs but we have not exposed it in the wrapper . if the Rating comes it could be viewed as an analog signals”
Analog inputs like Rating are continuous value, so I wondered if DSSTNE supports digital inputs like category id which is discrete value.

DSSTNE is wonderful. I feel like it has so much potential.
But I couldn’t figure how to use it well, and I couldn’t find detailed documentations online.

Thank you,
yuasa

Fix data split for multi-gpu

Sparse indexes of data are splitted correctly between gpus, but data are copied as if it was a single gpu. It is a bug.

Improve detection of malformed data point tuples in 'generateNetCDF' utility

Validation of data point tuples in 'generateNetCDF' is currently very limited. For example, it will not detect a data point tuple that contains unexpected characters that would typically indicate data corruption. See issue #62 for an example.

It would be worth looking at how we can implement more robust validation while minimally impacting the speed of the parser. This is also be an opportunity to review the parser code at a higher level to identify any other optimisations that might apply.

The parser functionality for generateNetCDF is currently implemented via helper functions in NetCDFHelper.cpp.

Improved isolation of GPU and non-GPU functionality in utils and headers

I have been working on some more unit tests for DSSTNE. One of the issues that I have run into is how to structure the test suite so that parts of the code that do not depend on CUDA could be executed in a CI environment such as Travis CI.

The approach I'm exploring at the moment is to make more aggressive use of forward declarations in the header files, such that the test suite could be built without a full CUDA installation. I wanted to raise an issue to get the DSSTNE team's thoughts before diving too deep, or submitting a PR.

I've also noticed a few other things that could be improved, such as moving 'using namespace x' declarations from header files to source files. These are all minor changes, but will help with isolation between 'utils' and 'engine'.

Can we use dsstne(or similar functions) in the AWS Machine Learning

Hi,

As it said in title, we hope using the AWS Machine Learning to do product recommendations(top-N, and base on deep learning). But we found the ML just be used to predict whether a customer will buy a product.

Thanks,
Andy.

Compilation Fail on Ubuntu 16.04 64 bit

r/lib/gcc/x86_64-linux-gnu/5/include/mwaitxintrin.h(36): error: identifier "__builtin_ia32_monitorx" is undefined

/usr/lib/gcc/x86_64-linux-gnu/5/include/mwaitxintrin.h(42): error: identifier "__builtin_ia32_mwaitx" is undefined

Getting this compilation error while compiling destiny on Ubuntu 16.04 64 bit

Classification versus Recommendation

I followed your tutorial and wanted to apply dsstne to a different project. It seems that the only output of the predict method is one that generates recommendations with the trained net and I want classification output.

I tried training a feedforward network with the output layer being classification data to all my training instances, but the output it generated doesn't seem right. I am hoping there is a predict method call for this situation.

Is there a way to produce classification output from a trained network?
Is there a max number of features for a training instance? (I tried 20,000 initially but got a 'std::bad_alloc' error. 10,000 produced no error)

What does the denoising parameter do?

To the extent that documentation covers it (https://github.com/amznlabs/amazon-dsstne/blob/master/docs/getting_started/LDL.txt#L23) it's far from clear. Some operation is performed on input events with some probability p?

ami-d6f2e6bc not found

i couldn't find an instance on AWS called 'ami-d6f2e6bc' do you have a link to one?

Issue with final build

I have followed the setup process but when I run make it gives me this error:
/bin/ld: cannot find -lmpi_cxx
/bin/ld: cannot find libcblas.a
/bin/ld: cannot find libatlas.a

I have installed installed atlas-x86-base

Thanks
Bhavesh

Error: invalid device function launching kernel kScaleAndBias_kernel

Hi All,

I am following the example with movielens and got the error as below:

train -c config.json -i gl_input.nc -o gl_output.nc -n gl.nc -b 256 -e 10

NNNetwork::NNNetwork: 1 input layer
NNNetwork::NNNetwork: 1 output layer
NNWeight::NNWeight: Allocating 13697024 bytes (128, 26752) for weights between Input and Hidden
Error: invalid device function launching kernel kScaleAndBias_kernel
GpuContext::Shutdown: Shutting down cuBLAS on GPU for process 0
GpuContext::Shutdown: CuBLAS shut down on GPU for process 0
GpuContext::Shutdown: Shutting down cuRand on GPU for process 0
GpuContext::Shutdown: CuRand shut down on GPU for process 0
GpuContext::Shutdown: Process 0 out of 1 finalized.

Could you suggest some hints to figure out what did I do wrong?

Thanks a lot.

more details on SparsenessPenalty

in the example of recommend system's conf.josn

{
"Version" : 0.7,
"Name" : "AE",
"Kind" : "FeedForward",
"SparsenessPenalty" : {
"p" : 0.5,
"beta" : 2.0
},

"ShuffleIndices" : false,

"Denoising" : {
    "p" : 0.2
},

"ScaledMarginalCrossEntropy" : {
    "oneTarget" : 1.0,
    "zeroTarget" : 0.0,
    "oneScale" : 1.0,
    "zeroScale" : 1.0
},
"Layers" : [
    { "Name" : "Input", "Kind" : "Input", "N" : "auto", "DataSet" : "gl_input", "Sparse" : true },
    { "Name" : "Hidden", "Kind" : "Hidden", "Type" : "FullyConnected", "N" : 128, "Activation" : "Sigmoid", "Sparse" : true },
    { "Name" : "Output", "Kind" : "Output", "Type" : "FullyConnected", "DataSet" : "gl_output", "N" : "auto", "Activation" : "Sigmoid", "Sparse" : true }
],

"ErrorFunction" : "ScaledMarginalCrossEntropy"

}

For hidden layer,:
"SparsenessPenalty" : # Indicates whether sparseness penalty should be applied (default false)

So, the conf.json which defines
"SparsenessPenalty" : {
"p" : 0.5,
"beta" : 2.0
},
But in the hidden layer, SparsenessPenalty is default value (false)
Will SparsenessPenalty work or not?
And i know beta is a weight, and what about p？
Could you interpreter the neural network more detailed or give some papers?
Thank you!

description of the input data ml20m-all

the dataset format:
1 2,1112486027:29,1112484676:32,1112484819:47,1112484727:50,1112484580:112,1094785740:151,1094785734:223,1112485573:253,1112484940:260,1112484826:293,1112484703:296,1112484767:318,1112484798:337,1094785709:367,1112485980:541,1112484603:589,1112485557:593,1112484661:653,1094785691:919,1094785621:924,1094785598:1009,1112486013:1036,1112485480:1079,1094785665:......

1 is the user id
2,1112486027 29,1112484676 32,1112484819:47 ...... is a pair, which 2 is movie id, and 1112486027 stands for time?

it is different from movie lens dataset in http://grouplens.org/datasets/movielens/

no ratings.

Can i say the userid 1 like movie id 2; but not like movie 3?

Could you interpreter the dataset more detailed？

Thank you!

What the @#$% is this?

In NNNetwork.cpp, the following lines appear:

    if (maxMemory == 131072 ) {
    maxMemory = 138943;
}

This either looks like some debugging code I may have added tracking down the maxSparse bug or something someone kludged in. Is this familiar to anyone?

dockerfile build image error

When I use the dockerfile ,It comes errror just like below:
Step 17 : COPY src /opt/amazon/dsstne/src
lstat src: no such file or directory