
Comments (17)

snwagh commented on August 21, 2024

Indeed, the function is not implemented, hence the bizarre accuracy numbers. I would not have the time to implement this myself, but here's a brief sketch of the logic for the simplest way to get this working, parts of which should already be implemented (see the code sketch after the list):

  • Use the predict function, which implicitly computes the maximal activation, to get a class prediction.
  • While you load the test data, load the labels into the variable outputData, so that you have access to the true class label.
  • Reconstruct these values (from MPC to plaintext) and compare them (this part is probably already implemented here).
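A minimal sketch of what this could look like, assuming Falcon-style helpers: `maxIndex` is the secret-shared argmax produced by predict(), `outputData` holds the secret-shared true labels, and funcReconstruct opens shares to plaintext (its exact signature may differ in the codebase):

```cpp
// Hedged sketch of getAccuracy(): open the predicted class indices and the
// true labels, then count matches. All types/helpers (RSSVectorMyType,
// myType, funcReconstruct) are assumed to behave as in falcon-public.
void getAccuracy(const RSSVectorMyType &maxIndex,
                 const RSSVectorMyType &outputData,
                 size_t size, vector<size_t> &counter)
{
    vector<myType> prediction(size), groundTruth(size);
    funcReconstruct(maxIndex, prediction, size, "prediction", false);
    funcReconstruct(outputData, groundTruth, size, "label", false);

    for (size_t i = 0; i < size; ++i)
        if (prediction[i] == groundTruth[i])
            counter[0]++;        // correctly classified samples
    counter[1] += size;          // total samples seen

    cout << "Accuracy: " << (100.0 * counter[0]) / counter[1] << " %" << endl;
}
```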

Let me know if you need clarification on any of these. If you manage to complete this, open a pull request and I can merge it into this repo.


AndesPooh258 commented on August 21, 2024

Thank you for your reply! I have implemented the getAccuracy() function and confirmed the correctness of my algorithm. However, I encountered another issue when testing the code with the SecureML model and the MNIST dataset. In particular, I parsed the MNIST dataset downloaded from the internet and modified the train() function in secondary.cpp to print the training and testing accuracy after each iteration. I found that the training and testing accuracy are low and do not change after the first iteration, as shown in the figure below. In addition, the weights and biases of the FC layers of the network do not change even though updateEquation() has been called. Is there any possible reason for this issue, and how can I fix it? Thanks a lot!
[Screenshot: training result]


snwagh commented on August 21, 2024

So I think there must be an issue with the training. An accuracy of 9-11% indicates that the model is outputting essentially random predictions (MNIST has 10 classes, so random guessing gives about 10%), so you need to do a bit of parameter tuning.

What is the learning rate you're using? Fixed-point precision? And the weight initialization?


AndesPooh258 commented on August 21, 2024

I did not change any code in FCLayer.cpp or the hyperparameters provided in global.h (except that I changed NUM_ITERATIONS to 5). So LOG_LEARNING_RATE is 5, FLOAT_PRECISION is 13, LEARNING_RATE is (1 << (FLOAT_PRECISION - LOG_LEARNING_RATE)), and the weights and biases are initialized to all 0.
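For reference, here is how those constants fit together (a worked reading of the values above; the declaration style may differ from the actual global.h):

```cpp
#define FLOAT_PRECISION 13     // fractional bits of the fixed-point encoding
#define LOG_LEARNING_RATE 5    // learning rate = 2^-5
#define LEARNING_RATE (1 << (FLOAT_PRECISION - LOG_LEARNING_RATE))
// LEARNING_RATE = 1 << 8 = 256, which as a fixed-point value encodes
// 256 / 2^13 = 2^-5 = 0.03125 -- the actual learning rate.
```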


snwagh commented on August 21, 2024

The weight initialization makes a big difference. Biases set to 0 are fine. For the weights, ideally you would use Kaiming He initialization, but you can use a more "hacky" form of it with something similar to the code provided in FCLayer.cpp.
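For reference, this is what Kaiming He initialization computes in the clear (a sketch, not Falcon code; `fanIn` is the number of inputs to the layer, e.g. 784 for the first SecureML FC layer on MNIST):

```cpp
#include <cmath>
#include <random>
#include <vector>

// He initialization: weights drawn from a normal distribution with
// mean 0 and standard deviation sqrt(2 / fan_in).
std::vector<float> heInit(std::size_t fanIn, std::size_t count)
{
    std::mt19937 gen(std::random_device{}());
    std::normal_distribution<float> dist(0.0f, std::sqrt(2.0f / fanIn));
    std::vector<float> w(count);
    for (auto &x : w)
        x = dist(gen);
    return w;
}
```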

You have to set two things. First, generate random "small" values (about 0.001). Second, ensure the RSS constraint is met: each pair of parties shares one of these random values, so the randomness has to be generated as common randomness. (For ease, you can set two of the three RSS share components to zero, but the remaining non-zero component has to be generated randomly and held in common by the appropriate pair of parties.)
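A minimal sketch of that idea, assuming 3-party replicated sharing with components (x0, x1, x2), where party A holds (x0, x1), B holds (x1, x2), and C holds (x2, x0); setting x1 = x2 = 0 means only A and C need common randomness for x0. The helpers (floatToMyType, partyNum, PARTY_*) are assumed to behave as in falcon-public:

```cpp
#include <random>
#include <utility>

// Hacky small-random initialization under RSS: x = x0 + x1 + x2 with
// x1 = x2 = 0 and x0 a small common random value. sharedPRG must be
// seeded identically on parties A and C (shared randomness), NOT with
// independent per-party seeds.
void initializeWeightsRSS(RSSVectorMyType &weights, size_t size,
                          std::mt19937 &sharedPRG)
{
    std::uniform_real_distribution<float> dist(-0.001f, 0.001f);
    for (size_t i = 0; i < size; ++i)
    {
        myType x0 = 0;
        if (partyNum == PARTY_A || partyNum == PARTY_C)
            x0 = floatToMyType(dist(sharedPRG));   // common random value
        if (partyNum == PARTY_A)      weights[i] = std::make_pair(x0, (myType)0);
        else if (partyNum == PARTY_B) weights[i] = std::make_pair((myType)0, (myType)0);
        else /* PARTY_C */            weights[i] = std::make_pair((myType)0, x0);
    }
}
```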


AndesPooh258 commented on August 21, 2024

I have changed the weight initialization to initialize the weights randomly, following the idea in FCLayer.cpp. Now the weights of the first two FC layers are updating throughout the iterations. However, the third FC layer is still not updating (I get the odd observation that the deltaWeight variable of the third FC layer of SecureML always remains all 0). As a result, the training and testing accuracy is still very low and fluctuating.

I have also tried including more training and testing data as well as different values of LOG_LEARNING_RATE in global.h, but it does not help improve the accuracy.

[Screenshot: training log]


snwagh commented on August 21, 2024

I think I know what's causing this. The 32-bit space is too small for the entire training. Try setting myType to uint64_t and increasing the fixed-point precision to about 20 bits.
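Concretely, that is a two-line change in global.h (values as suggested here; the exact declaration style may differ in the actual file):

```cpp
typedef uint64_t myType;      // was uint32_t: 32 bits overflow during training
#define FLOAT_PRECISION 20    // was 13: more fractional bits for the fixed point
```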


AndesPooh258 commented on August 21, 2024

I changed myType to uint64_t and FLOAT_PRECISION to 20 and observed something like a gradient explosion. I also tried adjusting LOG_LEARNING_RATE from 5 up to 19, but the weights and deltaWeights still take very large values.

[Screenshot: training log]


snwagh commented on August 21, 2024

Can you print all the weights and activations for the first forward and backward pass? The weights seem to have already overflown. With 20 bits of fixed-point precision, any value above about 1000 risks overflow: such a value encodes to roughly 2^30, and a single multiplication of two such values reaches about 2^60 before truncation, close to the 64-bit limit. A LOG_LEARNING_RATE of 19 seems to be too high.


AndesPooh258 commented on August 21, 2024

The figure I attached above uses LOG_MINI_BATCH 7 and LOG_LEARNING_RATE 5. Since the weights and deltaWeights are quite large in size, I will try to attach as much as possible. The figure below shows the first weights and deltaWeights with LOG_MINI_BATCH 3 and LOG_LEARNING_RATE 5.
[Screenshots: weights and deltaWeights]


snwagh commented on August 21, 2024

I would recommend you don't print the entire sets; print only the first 10 input and output values (and other values) for each layer. Right now the weights seem fine. deltaWeight doesn't look right, but it is hard to say why without knowing: (1) which layer the above variables were printed from (assuming it is the SecureML network: the first, second, or third FCLayer?); (2) the other inputs/outputs to this computation, particularly the delta calculation; (3) finally, whether you are running with or without normalization (if I remember correctly, you probably want the control flow to use this part of the code).

So print the outputs (only of the FC layers, or, since it is a small network, all the layers including the ReLUs) in the following manner, containing the first 10 samples of each of the following (a sketch of a printing helper follows the lists):
Forward:

  • Input1, weights1, output1
  • Input2 (which should be the same as output1 or a ReLU applied on it), weights2, output2
  • Input3 (which should be the same as output2 or a ReLU applied on it), weights3, output3

Backward:

  • delta3 (computed through NeuralNetwork.cpp), deltaWeight3
  • delta2 (computed through NeuralNetwork.cpp), deltaWeight2
  • delta1 (computed through NeuralNetwork.cpp), deltaWeight1
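A hedged sketch of a printing helper for this (funcReconstruct's signature is assumed to match falcon-public; the signed cast recovers negative fixed-point values):

```cpp
// Open a secret-shared vector and print its first n entries as floats.
void printFirst(const RSSVectorMyType &v, size_t n, const string &name)
{
    vector<myType> plain(n);
    funcReconstruct(v, plain, n, name, false);
    cout << name << ": ";
    for (size_t i = 0; i < n; ++i)
        cout << ((double)(int64_t)plain[i]) / (1 << FLOAT_PRECISION) << " ";
    cout << endl;
}
```

Calling this on the input, weights, and output of each layer after the forward pass, and on delta/deltaWeight after the backward pass, produces exactly the lists above.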


llCurious commented on August 21, 2024

Hey, I am also working on this part. Could @AndesPooh258 open a PR or post the link to your GitHub repo? Thanks a lot!!


AndesPooh258 commented on August 21, 2024

> I would recommend you don't print the entire sets; print only the first 10 input and output values (and other values) for each layer. [...]

As I am currently dealing with multiple deadlines, I will do the testing for (1) and (2) as soon as I have completed them. For (3), I am currently setting WITH_NORMALIZATION to true.

> Hey, I am also working on this part. Could @AndesPooh258 open a PR or post the link to your GitHub repo? Thanks a lot!!

I have made a fork of this repo here.


AndesPooh258 commented on August 21, 2024

Batch input (first 2000 inputs): [screenshot]

Forward of FC layer 1 (first 200 elements of weights and activations): [screenshot]

Forward of FC layer 2 (first 200 elements of weights and activations): [screenshot]

Forward of FC layer 3 (first 200 elements of weights and activations): [screenshot]

Update equation of FC layer 1 (first 100 elements of weights, deltas, and deltaWeights): [screenshot]

Update equation of FC layer 2 (first 100 elements of weights, deltas, and deltaWeights): [screenshot]

Update equation of FC layer 3 (first 100 elements of weights, deltas, and deltaWeights): [screenshot]


snwagh commented on August 21, 2024

Good chance the issue is caused by non-normalized inputs. Can you try after converting the inputs to floats between 0 and 1 (by default MNIST pixel values are in the 0-255 range)?
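A sketch of what that conversion could look like inside MNISTParse.c (variable names here are illustrative, not the file's actual ones):

```cpp
// Scale raw 0-255 pixel bytes to floats in [0, 1] before writing them out.
unsigned char pixel;
fread(&pixel, sizeof(pixel), 1, imageFile);
fprintf(outFile, "%f ", (float)pixel / 255.0f);
```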


AndesPooh258 commented on August 21, 2024

I have modified MNISTParse.c to convert the inputs to floats between 0 and 1. Now the weights and activations of the first iteration look normal. However, the deltas and deltaWeights in updateEquation() are still very large, so the weights become large after the first update.

Forward (first 200 elements of weights and activations): [screenshot]

Update equation (first 200 elements of weights and activations): [screenshot]


snwagh commented on August 21, 2024

Right, can you now print the weights/input/output of each FCLayer? Until some work is done on automating this / building more software, unfortunately we're stuck with this "looks reasonable" style of debugging. To help you break the process down further, you can do the following:

  • First check if the forward pass is going fine
  • If it is, check whether the error computation (the delta computation for the final layer) looks fine. This is the most likely source of the error (a sanity-check sketch follows this list).
  • Then see if the backprop deltas look fine
  • And then finally the actual gradients of the weights
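For the second step, a speculative sanity check (deltaFinal is a hypothetical handle to the last layer's delta shares; the check assumes the final-layer error is roughly prediction minus one-hot label, so with normalized inputs each entry should be small):

```cpp
#include <cassert>
#include <cmath>

// Reconstruct the first 10 entries of the final-layer delta and verify they
// have not overflown; values far outside [-1, 1] point at the error
// computation rather than the backprop.
vector<myType> plain(10);
funcReconstruct(deltaFinal, plain, 10, "delta", false);
for (size_t i = 0; i < 10; ++i)
{
    double d = ((double)(int64_t)plain[i]) / (1 << FLOAT_PRECISION);
    assert(fabs(d) <= 1.5 && "final-layer delta looks overflown");
}
```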

