swift-apis's Introduction

Swift for TensorFlow Deep Learning Library

Get a taste of protocol-oriented differentiable programming.

This repository hosts Swift for TensorFlow's deep learning library, available both as a part of Swift for TensorFlow toolchains and as a Swift package.

Usage

This library is automatically integrated into Swift for TensorFlow toolchains. You do not need to add this library as a Swift Package Manager dependency.

Use Google Colaboratory

Open an empty Colaboratory now to try out Swift, TensorFlow, differentiable programming, and deep learning.

For detailed usage and troubleshooting, see Usage on the Swift for TensorFlow project homepage.

Define a model

Simply import TensorFlow to get the full power of TensorFlow.

import TensorFlow

let hiddenSize: Int = 10

struct Model: Layer {
    var layer1 = Dense<Float>(inputSize: 4, outputSize: hiddenSize, activation: relu)
    var layer2 = Dense<Float>(inputSize: hiddenSize, outputSize: hiddenSize, activation: relu)
    var layer3 = Dense<Float>(inputSize: hiddenSize, outputSize: 3, activation: identity)
    
    @differentiable
    func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        return input.sequenced(through: layer1, layer2, layer3)
    }
}

Initialize a model and an optimizer

var classifier = Model()
let optimizer = SGD(for: classifier, learningRate: 0.02)
Context.local.learningPhase = .training
// Dummy data.
let x: Tensor<Float> = Tensor(randomNormal: [100, 4])
let y: Tensor<Int32> = Tensor(randomUniform: [100])

Run a training loop

One way to define a training epoch is to use the gradient(at:in:) function.

for _ in 0..<1000 {
    let 𝛁model = gradient(at: classifier) { classifier -> Tensor<Float> in
        let ŷ = classifier(x)
        let loss = softmaxCrossEntropy(logits: ŷ, labels: y)
        print("Loss: \(loss)")
        return loss
    }
    optimizer.update(&classifier, along: 𝛁model)
}

Another way is to make use of methods on Differentiable or Layer that produce a backpropagation function. This allows you to compose your derivative computation with great flexibility.

for _ in 0..<1000 {
    let (ŷ, backprop) = classifier.appliedForBackpropagation(to: x)
    let (loss, 𝛁ŷ) = valueWithGradient(at: ŷ) { ŷ in softmaxCrossEntropy(logits: ŷ, labels: y) }
    print("Model output: \(ŷ), Loss: \(loss)")
    let (𝛁model, _) = backprop(𝛁ŷ)
    optimizer.update(&classifier, along: 𝛁model)
}

For more models, go to tensorflow/swift-models.

Development

Documentation covering development can be found in the Developer Guide.

Bugs

Please report bugs and feature requests using GitHub issues in this repository.

Community

Discussion about Swift for TensorFlow happens on the [email protected] mailing list.

Contributing

We welcome contributions: please read the Contributor Guide to get started. It's always a good idea to discuss your plans on the mailing list before making any major submissions.

Code of Conduct

In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.

The Swift for TensorFlow community is guided by our Code of Conduct, which we encourage everybody to read before participating.

swift-apis's Issues

Layers without biases - optional weights

Currently the Dense layer must have a bias - there should be a way for the user to specify whether or not they want that bias value. Will something along these lines be possible eventually? https://gist.github.com/tanmayb123/f3631d283cb763ccc30447869efe13e8

For now, I think I'd need to do something along these lines: https://gist.github.com/tanmayb123/0fd2a935dacb9e012174035ba4aed25f (this compiles successfully)

The second solution isn't very elegant, though.
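
For concreteness, here is a minimal sketch of the kind of workaround I mean: a separate bias-free layer type defined alongside Dense (illustrative only, not an existing API, and assuming a glorotUniform initializer with a default seed):

import TensorFlow

// A hypothetical bias-free counterpart to Dense (sketch only).
struct DenseNoBias<Scalar: TensorFlowFloatingPoint>: Layer {
    var weight: Tensor<Scalar>

    init(inputSize: Int, outputSize: Int) {
        self.weight = Tensor(glorotUniform: [inputSize, outputSize])
    }

    @differentiable
    func callAsFunction(_ input: Tensor<Scalar>) -> Tensor<Scalar> {
        // No bias term: just a matrix multiplication.
        return matmul(input, weight)
    }
}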

Error in TransposedConv2d

Refer to comment here.

Even while trying to write test cases for #174 and the transposed conv2d layer, I found a similar error and reported it there. I think we need to add preconditions and check the layer logic thoroughly.

Implementing options in various Layers

Currently, all layers are defined with every option fixed to a default. For example, the bias in a variety of layers could be made a boolean option, and initialization of the recurrent layers is hard-coded to glorotUniform; when more initializers are added, we might also want to offer that flexibility to the user.

Similarly, convolution layers and other layers, both present and yet to be added, might benefit from this added flexibility.

I think we could also discuss whether providing an option makes sense in a few cases; for example, if bias is made a boolean option, then some of the code becomes redundant:

if bias == true:
    b_ih = Tensor(glorotUniform: shape, seed)
...

earlier:
op = tanh(weight + bias)

now:
op = weight
if bias == true:
    op += bias
output = tanh(op)

Sometimes options make the code repetitive and bloated, so I'm wondering whether there is any way to work around this.

#38 also talks about this problem along similar lines.

Learning with GPU docs

It's unclear how GPU/TPU support works in Swift for TensorFlow. I haven't found any docs on this topic.

In PyTorch I am able to move a tensor from CPU to GPU by calling the special method .to(device), e.g. x.to(t.device("cuda:2")) to move x to the GPU with id 2.

How does this work in Swift for TensorFlow? I noticed that I could change the accelerator setting in Colab from None to GPU and all computations would be performed on the GPU, but it's still unclear how I could do something like that from code.

I believe that training models on one or more GPUs is a very important part of the deep learning pipeline, and thus it should be covered in the docs in some form.
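
For comparison, the kind of scoped API I would hope for might look roughly like the following. I'm treating withDevice(_:_:perform:) as a hypothetical helper here, since I haven't found docs confirming what actually exists:

import TensorFlow

let a = Tensor<Float>(randomNormal: [100, 4])

// Hypothetical: run the ops in this closure on GPU 0 instead of the CPU.
let b = withDevice(.gpu, 0) {
    matmul(a, a.transposed())
}
print(b.shape)  // [100, 100]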

Add more optimizers and losses

Just a small-ish roadmap of the optimizers and losses we could look at adding:

Optimizers:-

  • Adam
  • Adagrad
  • SGD
  • RMSprop
  • AdaDelta
  • Riemann SGD (removed by #311)
  • Adabound ->ICLR 2019 -> Repo
  • LBFGS
  • AdaMax
  • SparseAdam

Losses:-

  • L1Loss
  • L2Loss
  • MeanSquaredError
  • SoftmaxCrossEntropy
  • SoftmaxCrossEntropyWithLogits
  • SigmoidCrossEntropy
  • MeanAbsoluteError
  • MeanAbsolutePercentageError
  • MeanSquaredLogarithmicError
  • HingeLoss
  • SquaredHinge
  • CategoricalHinge
  • Logcosh
  • CategoricalCrossEntropy
  • Kullback Leibler Divergence
  • NegativeLogLikelihood
  • Cosine
  • TripletMarginLoss
  • Poisson

Removed NegativeLogLikelihood. Check this discussion for more.

Note: If you have suggestions for losses and optimizers, please suggest them in this issue.
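
For illustration, here is a rough sketch of what one of the listed losses could look like, following the same free-function convention as softmaxCrossEntropy (the name and signature are my assumptions, not a settled API):

import TensorFlow

// Sketch of a mean absolute error (L1) loss, differentiable w.r.t. the predictions.
@differentiable(wrt: predicted)
func meanAbsoluteError<Scalar: TensorFlowFloatingPoint>(
    predicted: Tensor<Scalar>,
    expected: Tensor<Scalar>
) -> Tensor<Scalar> {
    // Mean of the element-wise absolute differences.
    return abs(expected - predicted).mean()
}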

Implement Convolution Transpose Layer

While a convolutional layer is already implemented in the existing layers, a transposed version would let us start building up interesting architectures such as GANs and VAEs.

Wrapper to enable recurrent cells to handle sequential input

In Flux (a machine learning library for Julia) you can wrap an RNN cell in a Recur type so it keeps track of the hidden state for you. Here's an example from their code:

accum(h, x) = (h+x, x)
rnn = Flux.Recur(accum, 0)
rnn(2) # 2
rnn(3) # 3
rnn.state # 5
rnn.(1:10) # apply to a sequence
rnn.state # 60

Would it be valuable to have something similar with swift-apis?
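
For reference, here is a rough, non-differentiable sketch of what such a wrapper could look like in Swift (all names are hypothetical, and the Layer/Differentiable machinery is ignored for now):

// A Flux-style wrapper that threads the state through each call (sketch only).
struct Recur<State, Input, Output> {
    var state: State
    let step: (State, Input) -> (State, Output)

    mutating func callAsFunction(_ input: Input) -> Output {
        let (newState, output) = step(state, input)
        state = newState
        return output
    }
}

// Mirroring the Flux accumulator example above:
var rnn = Recur(state: 0) { (h: Int, x: Int) in (h + x, x) }
print(rnn(2))     // 2
print(rnn(3))     // 3
print(rnn.state)  // 5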

Issue with applied(to:in:) differentiation

I've got this simple code:

import TensorFlow
enableGPU()

public struct test: Layer {

    @differentiable(wrt: (self, input))
    public func applied(to input: Tensor<Float>, in context: Context) -> Tensor<Float> {
        return input * 5
    }

}

public struct Generator: Layer {

    public typealias Input = Tensor<Float>
    public typealias Output = Tensor<Float>

    let tester = test()

    @differentiable(wrt: (self, input))
    public func applied(to input: Tensor<Float>, in context: Context) -> Tensor<Float> {
        let o1 = tester.applied(to: input, in: context)
        return o1
    }

}

However, I get the following error:

main.swift:20:27: error: can only differentiate with respect to parameters that conform to 'Differentiable', but 'Generator' does not conform to 'Differentiable'
    @differentiable(wrt: (self, input))
                          ^
main.swift:13:15: error: type 'Generator' does not conform to protocol 'Layer'
public struct Generator: Layer {
              ^
main.swift:21:17: note: candidate is missing attribute '@differentiable(wrt: (self, input))'
    public func applied(to input: Tensor<Float>, in context: Context) -> Tensor<Float> {
                ^
main.swift:13:15: error: type 'Generator' does not conform to protocol '__Differentiable'
public struct Generator: Layer {
              ^
Layer.swift:67:10: note: protocol requires function 'applied(to:in:)' with type '(Generator.Input, Context) -> Generator.Output' (aka '(Tensor<Float>, Context) -> Tensor<Float>'); do you want to add a stub?
    func applied(to input: Input, in context: Context) -> Output
         ^
Swift.__Differentiable:2:20: note: protocol requires nested type 'TangentVector'; do you want to add it?
    associatedtype TangentVector : AdditiveArithmetic
                   ^
Swift.__Differentiable:3:20: note: protocol requires nested type 'CotangentVector'; do you want to add it?
    associatedtype CotangentVector : AdditiveArithmetic
                   ^
Swift.__Differentiable:4:20: note: protocol requires nested type 'AllDifferentiableVariables'; do you want to add it?
    associatedtype AllDifferentiableVariables : Differentiable
                   ^
main.swift:13:15: error: type 'Generator' does not conform to protocol 'Differentiable'
public struct Generator: Layer {
              ^
main.swift:13:15: error: type 'Generator' does not conform to protocol '_Differentiable'
public struct Generator: Layer {
              ^

Did I do something wrong here?

flattened() has not been marked '@differentiable'

I have some code like this:

let logits = Tensor<Float>(randomUniform: [1, 10]) 
let gold = Tensor<Float>(randomUniform: [10]) 

logits.valueWithGradient { yhat in softmaxCrossEntropy(logits: yhat.flattened(), oneHotLabels: gold) }

I received the following error:

error: <Cell 80>:4:26: error: function is not differentiable
logits.valueWithGradient { yhat in softmaxCrossEntropy(logits: yhat.flattened(), oneHotLabels: gold) }
                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<Cell 80>:4:69: note: cannot differentiate an external function that has not been marked '@differentiable'
logits.valueWithGradient { yhat in softmaxCrossEntropy(logits: yhat.flattened(), oneHotLabels: gold) }
                                                                    ^

Could anyone take a look at this?

Simple Optimizer Test

Hey S4TF team, I am trying a very simple AD exercise using Adam. Since Adam requires a Layer type, I created this very simple Variable struct that just holds the state. In this example I am just trying to minimize x^2:

import TensorFlow

struct Variable : Layer {
    var x: Float

    @differentiable
    func call(_ input: Float) -> Float {
        return x 
    }
}

var layer = Variable(x: 1.0)
let optimizer = Adam(for: layer)

let delta = layer.gradient { layer -> Float in
    let x = layer.x
    return x * x
}

print("Layer0:", layer)
print("Delta:", delta)

optimizer.update(&layer.allDifferentiableVariables, along: delta)

print("Layer1:", layer)

It appears that after the update, layer is not changing its x value:

Layer0: Variable(x: 1.0)
Delta: AllDifferentiableVariables(x: 2.0)
Layer1: Variable(x: 1.0)

Any idea why this is not minimizing x?

RNNCell output: `Output` and `State`?

There's something that's really bugging me about the RNNCell protocol: that TimeStepOutput is separate from the State. RNN cells, at least as far as I've seen them in other languages, will always only return the hidden state, and it's up to another Dense layer (or some counterpart) to process the hidden state.

In that case, why keep them separate? Why not just say that RNN Cells just return a State?

(please do let me know if I'm overlooking something)

Conv2D Signature and initialization

Reading the Conv2D Documentation I noticed two issues:

  1. The order of the arguments is not consistent across all the methods:
  • (filter, bias, strides, padding, activation) (which I prefer)
  • (filter, bias, activation, strides, padding).
  2. The use of a canned initialization (glorotUniform) in the API, where you give only the seed / generator. I think it would be convenient to have a protocol that defines the initialization strategy for the Layer:
protocol InitializationStrategy {
    func filter(shape: TensorShape) -> Tensor<Float>
    func bias(shape: TensorShape) -> Tensor<Float>
}

    convenience init(
        filterShape: (Int, Int, Int),
        stride: Int = 1,
        padding: Padding = .valid,
        activation: @escaping Activation = identity,
        initializator: InitializationStrategy
    ) {
        let filterTensorShape = TensorShape([
            filterShape.0, filterShape.1, filterShape.2])
        self.init(
            filter: initializator.filter(shape: filterTensorShape),
            bias: initializator.bias(shape: TensorShape([filterShape.2])),
            stride: stride,
            padding: padding,
            activation: activation)
    }

or even a more generic one to allow a single protocol for all initializations:

protocol InitializationStrategy {
    func values(for name: String, shape: TensorShape) -> Tensor<Float>
}

to be used it as

        self.init(
            filter: initializator.values(for: "Conv2D/filter", shape: filterTensorShape),
            bias: initializator.values(for: "Conv2D/bias", shape: TensorShape([filterShape.2]))
        )

Fix failing tests on Linux.

TrivialModelTests.testXOR has been failing on the Linux CI machine. It does not fail on macOS. This also seems related to the case where -O is enabled on the TensorFlow module in apple/swift#25166.

Test Case 'TrivialModelTests.testXOR' started at 2019-05-31 07:16:34.843
Exited with signal code 11
The command '/bin/sh -c /swift-tensorflow-toolchain/usr/bin/swift test' returned a non-zero code: 1

Adding Test Cases for Layers in Layer.swift file

There are a lot of layers which don't have test cases. I will start by adding test cases for global average pooling and global max pooling, but many more are left. I will also try to make a list of which layers have tests and which do not.

Also, I thought this could be a good first issue if someone wanted to contribute; a sketch of what such a test might look like follows below.
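
For example, a test for global average pooling might look roughly like this; I'm assuming the layer is named GlobalAvgPool2D and that it averages over the two spatial axes, so adjust to the actual API:

import XCTest
import TensorFlow

final class PoolingTests: XCTestCase {
    func testGlobalAvgPool2D() {
        let layer = GlobalAvgPool2D<Float>()
        // One 2x2 single-channel image; its global average is 2.5.
        let input = Tensor<Float>(shape: [1, 2, 2, 1], scalars: [1, 2, 3, 4])
        let output = layer.inferring(from: input)
        XCTAssertEqual(output, Tensor<Float>(shape: [1, 1], scalars: [2.5]))
    }
}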

Multiple return parameters

I'm trying to implement a simple/limited version of the meshgrid op. This is what I've got:

func meshgrid(x: Tensor<Float>, y: Tensor<Float>) -> (Tensor<Float>, Tensor<Float>) {
    let outputX = x.reshaped(to: [-1, 1])
    let outputY = y.reshaped(to: [-1, 1])
    let multFactX = Tensor<Float>(ones: [x.scalarCountTensor.scalarized()])
    let multFactY = Tensor<Float>(ones: [y.scalarCountTensor.scalarized()])
    return ((outputX * multFactX).transposed(), outputY * multFactY)
}

Of course, that can't be made differentiable because tuples aren't differentiable. So, I had to implement this:

struct TensorPair<T: TensorFlowFloatingPoint>: Differentiable {

    var first: Tensor<T>
    var second: Tensor<T>

    @differentiable
    init(_ first: Tensor<T>, _ second: Tensor<T>) {
        self.first = first
        self.second = second
    }

}

@differentiable
func meshgrid(x: Tensor<Float>, y: Tensor<Float>) -> TensorPair<Float> {
    let outputX = x.reshaped(to: [-1, 1])
    let outputY = y.reshaped(to: [-1, 1])
    let multFactX = Tensor<Float>(ones: [x.scalarCountTensor.scalarized()])
    let multFactY = Tensor<Float>(ones: [y.scalarCountTensor.scalarized()])
    return TensorPair((outputX * multFactX).transposed(), outputY * multFactY)
}

It works, but it's not a very elegant solution. Thoughts:

  1. Will tuples ever be differentiable?
  2. This could be made a bit more elegant by returning an array of tensors with the two values.

Numpy like slice

Hey, I am having a lot of fun learning Swift via S4TF and reading about the new features you guys are implementing; the project seems to be moving very fast :)

I was a little worried when I tried to do a simple slice and found out that it was more complicated than I expected. To try to make things more familiar I implemented a Slice enum and extended the Tensor struct with a new slice method that takes Slice... args in such a way that it behaves like numpy. Here is an example with its corresponding Python equivalent:

x.slice(.all, .at(1))  // x[:, 1]
x.slice(.all, .upto(3))  // x[:, :3]
x.slice(.all, .from(2))  // x[:, 2:]
x.slice(.all, .slice(1, 5))  // x[:, 1:5]
x.slice(.rest, .at(0))  // x[..., 0]

It works for Tensors of any shape and any slice combination. It would be awesome if Swift eventually supported complex subscripting so it could be implemented natively like this:

x[..., 1]  // x[:, 1]
x[..., ..<3]  // x[:, :3]
x[..., 2...]  // x[:, 2:]
x[..., 1..<5]  // x[:, 1:5]
x[???, 0]  // x[..., 0]  <-- no idea about this one

I thought this might be useful to others, or maybe it can even get integrated into the project. Here is the code if you are interested:

enum Slice : Equatable {
    case slice(Int, Int)
    case at(Int)
    case from(Int)
    case upto(Int)
    case all
    case rest

    func get_range(_ n: Int) -> (Int, Int) {
        var low: Int = 0
        var high: Int = n

        switch self {
            case .slice(let _low, let _high):
                low = _low
                high = _high
            case .at(let value):
                low = value
                high = value + 1
            case .from(let _low):
                low = _low
            case .upto(let _high):
                high = _high
            default:
                break 
        }


        if low < 0 {
            low += n
        }

        if high < 0 {
            high += n
        }


        return (low, high)
    }
}

enum SliceError : Error {
    case MultipleRest
    case TooManySlices
}

extension Tensor {

    func slice(_ slices: Slice...) throws -> Tensor {
        var slices = slices
        let ndim = Int(self.shape.count)

        if slices.count > ndim {
            throw SliceError.TooManySlices
        }

        if slices.count < ndim && !slices.contains(.rest) {
            slices.append(.rest)
        }

        let restCount = slices.filter { $0 == .rest }.count
        let notRestCount = slices.count - restCount
        let restExpand = ndim - notRestCount

        if restCount > 1 {
            throw SliceError.MultipleRest
        }

        if restExpand > 0 {
            let allSlices = Array(repeating: Slice.all, count: restExpand)
            let idx = slices.firstIndex(of: .rest)!

            slices = slices[..<idx] + allSlices + slices[(idx+1)...]
        }

        var lowerBounds: [Int32] = []
        var upperBounds: [Int32] = []

        for (slice, n) in zip(slices, self.shape.dimensions) {
            var (lower, upper) = slice.get_range(Int(n))
            
            if lower < 0 {
                lower = 0
            }

            if upper > n {
                upper = Int(n)
            }
            
            lowerBounds.append(Int32(lower))
            upperBounds.append(Int32(upper))
        }

        let squeezeDims = slices.enumerated()
            .filter { 
                if case .at(_) = $1 { return true } 
                else { return false } 
            }
            .map { $0.0 }
            .reversed()
        
        var output = self.slice(lowerBounds: lowerBounds, upperBounds: upperBounds)

        for i in squeezeDims {
            output = output.squeezingShape(at: Int32(i))
        }

        return output
    }
}

Update README to reflect the new role and structure of the library

The entire TensorFlow library has been moved from apple/swift to this repository, and the original DeepLearning module has been renamed to TensorFlow. One important remaining task is to change the README of this repo to describe the TensorFlow library as a whole, with examples, and to explain the structure of the newly restructured repo. After that, send an email to [email protected] summarizing this change. This task is on me later this week.

'SinModel.AllDifferentiableVariables' and 'SinModel.CotangentVector' must be equivalent

I'm working on an example with @brettkoonce, which is an RNN that can predict the sin function. This is the code I've got so far: https://gist.github.com/tanmayb123/366bc39caa18d71ed7ea0bd1d07e0a79

I'm getting this error:

main.swift:41:17: error: 'Adam' requires the types 'SinModel.AllDifferentiableVariables' and 'SinModel.CotangentVector' be equivalent
var optimizer = Adam<SinModel>(for: model)
                ^
Optimizer.swift:48:14: note: requirement specified as 'Model.AllDifferentiableVariables' == 'Model.CotangentVector' [with Model = SinModel]
public class Adam<Model: Layer>: Optimizer
             ^

I'm 99% sure this is being caused by the RNN<Cell: RNNCell> wrapper. How can we fix this?

I'd love your thoughts on this @rxwei.

Math Protocols

@rxwei @dan-zheng I wasn't sure where to put this, but I believe an issue here is a good place to collect our thoughts and comments. My initial thoughts are:

Pointwise Multiplicative

I have a couple comments about PointwiseMultiplicative:

  1. Similar to how AdditiveArithmetic defines - and -=, I believe we should define / and /= for PointwiseMultiplicative, thus enabling efficient division and making it dual to AdditiveArithmetic. It may not be very formal, but given that most of the math-related protocols are not, and are more geared towards practicality, I think this is fine. Also, for our purposes these protocols are used over aggregates of tensors where / and /= can be defined, so this change should be fine. What do you think?
  2. Following from point 1, if we aim for consistency with the standard library we may want to call this MultiplicativeArithmetic, or rename AdditiveArithmetic to PointwiseAdditive. I personally prefer the latter since it will also allow for consistency with e.g. PointwiseComparative, but I'm not sure how that would go with the Swift evolution process.

Optimizers

In order to simplify the remaining optimizers we need to add support for comparisons (e.g., max) and for computing the absolute value of tensors element-wise.

For comparisons, I believe something along the lines of the following would be great:

public protocol PointwiseEquatable {
  associatedtype Boolean
  static func == (lhs: Self, rhs: Self) -> Boolean
}

public protocol PointwiseComparable: PointwiseEquatable {
  static func < (lhs: Self, rhs: Self) -> Boolean
  static func <= (lhs: Self, rhs: Self) -> Boolean
  static func > (lhs: Self, rhs: Self) -> Boolean
  static func >= (lhs: Self, rhs: Self) -> Boolean
  static func max(lhs: Self, rhs: Self) -> Self
  static func min(lhs: Self, rhs: Self) -> Self
}

I'm not sure about the absolute value, but I believe we may be able to do something like:

public protocol PointwiseMagnitude { // ???
  func abs() -> Self
}

Reductions

We need some way to perform reductions over tensor aggregates. This comes up quite a lot in machine learning. For example, we often want to know the max over all elements in an aggregate. Or, for a more practical motivating example consider clipping gradients based on the global norm over the aggregate structure. This would require us to compute the norm of each tensor in the aggregate (norm[t]) and then compute:

globalNorm = sqrt(sum([norm[t].squared() for t in tensors]))

Say we can compute sqrt(_:) and squared() using a conformance to ElementaryFunctions. How do we go about the sum reduction over the aggregate?

Adding support for reductions introduces a couple of challenges. First, we would need to know the Scalar type of all tensors in the structure and force it to be the same for all. Alternatively, we can follow a similar approach to VectorProtocol and use Float for all tensors. However, in that case wouldn't we lose precision when dealing with, say, Double tensors (this problem also applies to VectorProtocol actually, so how do you handle it there?)? We could avoid this by having a Scalar type (which would also require all layers to define a Scalar type -- @rxwei you mentioned though that we want to avoid this to potentially allow for mixed-precision training). In either case, I believe this is worth a discussion.

Also, reducing over an aggregate would require a protocol that looks something like this:

public protocol Reducible {
  associatedtype Scalar

  func sum() -> Scalar where Scalar: AdditiveArithmetic
  func mean() -> Scalar where Scalar: AdditiveArithmetic
  func product() -> Scalar where Scalar: PointwiseMultiplicative

  // ... more reductions such as comparison-based reductions.

  // This needs to be used by the `_meanHelper()` for example.
  func count() -> Scalar

  // The following are needed for applying the reduction across the reduced members.
  static func _sumHelper(_ x: Scalar, _ y: Scalar) -> Scalar where Scalar: AdditiveArithmetic
  static func _meanHelper(_ x: Scalar, _ y: Scalar) -> Scalar where Scalar: AdditiveArithmetic
  static func _productHelper(_ x: Scalar, _ y: Scalar) -> Scalar where Scalar: PointwiseMultiplicative
}

This seems overly complicated so maybe we can find a better solution? One nice thing about using a Scalar type is that it may remove the need for a Reducible protocol by allowing users to perform reductions manually using KeyPathIterable. For example, my current implementation for clipping by global norm looks like this:

extension KeyPathIterable {
  public mutating func clipByGlobalNorm<Scalar: TensorFlowFloatingPoint>(clipNorm: Scalar) {
    let clipNorm = Tensor<Scalar>(clipNorm)
    var globalNorm = Tensor<Scalar>(zeros: [])
    for kp in self.recursivelyAllWritableKeyPaths(to: Tensor<Scalar>.self) {
      globalNorm += self[keyPath: kp].squared().sum()
    }
    globalNorm = sqrt(globalNorm)
    for kp in self.recursivelyAllWritableKeyPaths(to: Tensor<Scalar>.self) {
      self[keyPath: kp] *= clipNorm / max(globalNorm, clipNorm)
    }
  }
}

Of course it doesn't have to be defined as an extension to KeyPathIterable, but I use this for now because I cannot yet define it as an extension to Layer.TangentVector.

What are your thoughts on the above? Also, why do we call VectorProtocol that instead of VectorSpace?

Correct 'XCTAssertEqual(_:_:)' argument order.

#109 moved all TensorFlow APIs from apple/swift to this repository. However, some test assertions weren't updated because the argument order of expectEqual(_:_:) differs from that of XCTAssertEqual(_:_:): expectEqual(_:_:) expects the first argument to be the expected value, while XCTAssertEqual(_:_:) expects the second argument to be the expected value.

Here's a concrete example of a test assertion that used the old order:

XCTAssertEqual(Tensor(30), x.sum())

It should be changed to:

XCTAssertEqual(x.sum(), Tensor(30))

Task

Go through all tests (especially the ones in OperatorTests/) and make sure XCTAssertEqual(_:_:) arguments are ordered correctly: The first argument should be the value to be tested, and the second argument should be the expected value.

swift-build errors

So I built master before and after, and swift build throws these errors:

Compile Swift Module 'DeepLearning' (8 sources)
/home/paws/swift-apis/Sources/DeepLearning/Layer.swift:51:6: error: function result's 'pullback' type does not match 'inferring(from:)'
    @differentiating(inferring(from:))
     ^
/home/paws/swift-apis/Sources/DeepLearning/Layer.swift:54:38: note: 'pullback' does not have expected type '(Self.Output.CotangentVector) -> (Self.CotangentVector, Self.Input.CotangentVector)'
        -> (value: Output, pullback: (Output.TangentVector)
                                     ^~~~~~~~~~~~~~~~~~~~~~
/home/paws/swift-apis/Sources/DeepLearning/Layer.swift:46:10: note: 'inferring(from:)' defined here
    func inferring(from input: Input) -> Output {
         ^
/home/paws/swift-apis/Sources/DeepLearning/Layer.swift:1428:46: error: '_vjpCall(_:initialState:)' does not have expected type '<Cell where Cell : RNNCell> (RNN<Cell>) -> ([Cell.TimeStepInput], Cell.State) -> ([Cell.TimeStepOutput], (Array<Cell.TimeStepOutput.CotangentVector>.DifferentiableView) -> (RNN<Cell>.CotangentVector, Array<Cell.TimeStepInput.CotangentVector>.DifferentiableView))'
    @differentiable(wrt: (self, input), vjp: _vjpCall(_:initialState:))
                                             ^
/home/paws/swift-apis/Sources/DeepLearning/Layer.swift:77:22: error: cannot convert return expression of type '(Self.Output.CotangentVector) -> (Self.CotangentVector, Self.Input.CotangentVector)' to return type '(_) -> (layerGradient: _, inputGradient: _)'
        return (out, pullback)
                     ^~~~~~~~
                              as! (_) -> (layerGradient: _, inputGradient: _) 

Make 'gathering(atIndices:alongAxis:)' generic over 'Int32' and 'Int64'.

We need to make Tensor.gathering(atIndices:alongAxis:) support Int64 indices, but we cannot do so by making the index tensor's Scalar type be generic over BinaryInteger because the TensorFlow Gather operator (like many other operators that take indices) only supports Int32 and Int64 indices.

To resolve this, we should consider adding a TensorIndexScalar protocol and conform only Int32 and Int64 to it.

public protocol TensorIndexScalar: BinaryInteger & TensorFlowScalar {
    ...
}

unknown attribute 'frozen' @frozen

While building the swift-apis package on macOS I get this error.

PS: I am using the Xcode 10 | v0.3.1 | April 30, 2019 toolchain. Let me know what I am missing.

Compile Swift Module 'TensorFlow' (36 sources)
/Users/rajat/Documents/research/swift_tensorflow/swift_dl/swift_dl/Sources/TensorFlow/Core/DataTypes.swift:149:2: error: unknown attribute 'frozen' @frozen ^
/Users/rajat/Documents/research/swift_tensorflow/swift_dl/swift_dl/Sources/TensorFlow/Core/ShapedArray.swift:428:2: error: unknown attribute 'frozen' @frozen ^
/Users/rajat/Documents/research/swift_tensorflow/swift_dl/swift_dl/Sources/TensorFlow/Core/ShapedArray.swift:822:2: error: unknown attribute 'frozen' @frozen

Another issue with _vjpConv2DBackpropInput

After fixing issue #329, I've encountered another issue during backpropagation that results in the following type of error:

Fatal error: Incompatible shapes: [1,32,32,64] vs. [4,4,32,64]: file /Users/stephenjohnson/Projects/Conv2dTransposeTest/.build/checkouts/swift-apis/Sources/TensorFlow/Bindings/EagerExecution.swift, line 299
Illegal instruction: 4

It took me a while to track down, but I believe it is because conv2DBackpropInput declares its differentiation parameters in the order wrt: (x, filter)

/// TensorFlow builtin conv2d gradient helper for the input.
@differentiable(wrt: (x, filter), vjp: _vjpConv2DBackpropInput)
@usableFromInline
func conv2DBackpropInput<Scalar: TensorFlowFloatingPoint>(

but _vjpConv2DBackpropInput returns a pullback tuple in which the conv2DBackpropFilter call is the first item and the conv2D call is the second.

return (value, { v in
    (conv2DBackpropFilter(x, input: v, filterSizes: shape, strides: strides,
                          padding: padding, dilations: dilations),
     conv2D(v, filter: filter, strides: strides, padding: padding, dilations: dilations))

When I switch them around, everything appears to work fine. I'm going to submit a PR.

Recursive Neural Networks (structured data/trees)

I can't wait for AutoDiff to support control flow. On that topic, I had a quick question: how easy/difficult would it be to implement recursive (not recurrent) neural networks? (https://stackoverflow.com/questions/26022866/how-can-a-tree-be-encoded-as-input-to-a-neural-network)

Could I do something like this (this is a hastily written sample only): https://gist.github.com/tanmayb123/c57c956120d3733ed0691c6406a74284

Is there anything specific in that implementation that would prevent it from working? Is there any other/a better way to do BPTS (Backprop through Structure)?

Rename and refactor convolution and pooling operators.

We've decided to rename existing convolution and pooling methods to something more approachable and discoverable, following precedents in other frameworks.

Note:

  1. The new names we recommend are precisely described above. Further naming discussions are absolutely welcome, but would preferably be delayed until July/August. These new names will make it into Swift for TensorFlow version 0.4.
  2. These methods' derivatives (VJPs) will need to be changed to top-level functions as well.

Remove uses of assert(false)

assert(false) should not be used as a terminator because it gets compiled away in release builds. fatalError or preconditionFailure should be used instead. There are two occurrences of assert(false) and they have broken release builds of swift-apis, both in LazyTensorOperations.swift.

            assert(false, "TODO: to be send out in a separate PR.")
            // return op.materialized(index: index)
        default: assert(false, "Unhandled type: \(cDataType)")
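
Concretely, the replacement at those two call sites would look something like:

            fatalError("TODO: to be send out in a separate PR.")
            // return op.materialized(index: index)
        default: fatalError("Unhandled type: \(cDataType)")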

Swift Code Style

Can we use swiftlint to standardize the formatting for the repo? This would avoid review comments about formatting errors and speed up the overall process.

Breaking down the Layers.swift file

The Layers.swift file is long, and since we are actively working on adding layers (#54), it might make more sense to break it down into separate files for convolutions, pooling, recurrent layers, and so on. That might make the code more readable overall.
@rxwei @tanmayb123 Thoughts?

Retrieve layer/model weights

There should be an easy way to extract the weights of a Layer/Model and save them. Currently, I've got the not-so-pretty method of doing this:

public func weights() -> [Tensor<Float>] {
    return [dense1.weight, dense1.bias, dense2.weight, dense2.bias, dense3.weight, dense3.bias, dense4.weight, dense4.bias]
}

Originally, I thought of doing something like adding a weights() -> [Tensor<Scalar>] function to the Layer protocol. For example, Dense<Scalar: TensorFlowFloatingPoint> would implement it like so:

public func weights() -> [Tensor<Scalar>] {
    return [weight, bias]
}

My layer could then do this:

public func weights() -> [Tensor<Scalar>] {
    return dense1.weights() +
           dense2.weights() +
           dense3.weights() +
           dense4.weights()
}

Is there an easier way to do this?
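
One possible generic alternative is a sketch like the following, assuming the model's AllDifferentiableVariables conforms to KeyPathIterable (this is not an existing API, just an illustration):

import TensorFlow

// Collect all Float tensors reachable from the model's differentiable variables.
func allWeights<Model: Layer>(of model: Model) -> [Tensor<Float>]
    where Model.AllDifferentiableVariables: KeyPathIterable {
    let variables = model.allDifferentiableVariables
    return variables
        .recursivelyAllWritableKeyPaths(to: Tensor<Float>.self)
        .map { variables[keyPath: $0] }
}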

Implement Recurrent Layers

It would be really nice to have a set of recurrent layers, including LSTMs and GRUs, as part of the core layers API. Ideally we could parameterize the activation functions for the various gates rather than hard-coding the tanh and sigmoid functions, as sketched below.
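
As a sketch of the parameterization meant here, mirroring how Dense already accepts an activation closure (illustrative only, not a proposed API):

import TensorFlow

// A single gate whose nonlinearity is a stored, differentiable closure
// instead of a hard-coded sigmoid or tanh (sketch only).
struct Gate: Layer {
    var weight: Tensor<Float>
    var bias: Tensor<Float>
    @noDerivative let activation: @differentiable (Tensor<Float>) -> Tensor<Float>

    @differentiable
    func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        return activation(matmul(input, weight) + bias)
    }
}

// e.g. an update gate: Gate(weight: w, bias: b, activation: sigmoid)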

_vjpConv2DBackpropInput incorrectly using shape as filterSizes argument

_vjpConv2DBackpropInput is incorrectly passing the shape argument to conv2DBackpropFilter as the filterSizes argument instead of passing the filter shape. This causes the following error when attempting to compute gradients for the TransposedConv2D layer.

Fatal error: Conv2DCustomBackpropFilter: input depth must be evenly divisible by filter depth: file /Users/danielzheng/swift-tf/tensorflow-swift-apis/Sources/TensorFlow/Bindings/EagerExecution.swift, line 299
Illegal instruction: 4

Suppress LazyTensor test compilation warnings

[46/56] Compiling TensorFlowTests LazyTensorTests.swift
/swift-apis/Tests/TensorFlowTests/LazyTensorTests.swift:79:21: warning: 'let' pattern has no effect; sub-pattern didn't bind any variables
            if case let .symbolic(_) = t.handle {
                    ^~~~~~~~~~~~~~~~

Provide a way to access non-differentiable model state from `Optimizer.update`.

@dominikgrewe pointed out that K-FAC needs to access the model's non-differentiable internal state from Optimizer.update(_:along:). My initial idea is to change the optimizer API to the following:

public protocol Optimizer {
    associatedtype Model: Layer
    associatedtype Scalar: FloatingPoint
    var learningRate: Scalar { get }
    mutating func update(_ variables: inout Model, along gradient: Model.CotangentVector)
}

But this comes at the cost of complicating concrete optimizers' implementation of update(_:along:):

  • Too long: model.allDifferentiableVariables.recursivelyAllWritableKeyPaths(to: Tensor<Scalar>.self).
  • Too too long: model.allDifferentiableVariables[keyPath: kp] -= stepSize * firstMoments[keyPath: kp] / (sqrt(secondMoments[keyPath: kp]) + epsilon).

And we obviously can't assign secondMoments.allDifferentiableVariables to a local variable because we need setter access. Something like inout var modelVariables = model.allDifferentiableVariables is not possible in Swift yet until the ownership model and related things get fleshed out.

Another issue is that making model inout is not semantically accurate: we don't want to mutate a model's non-differentiable state in an optimizer.

Maybe what we really need is optimizer-specific protocols that require model states, which models will implement. The specific optimizers will have such a generic constraint, and take these states as initializer parameters.

Indexing Assignment error on Ubuntu 18.04 nightly toolchain (v0.4)

This issue was discovered by @jon-tow and sent to [email protected].

Indexing assignment no longer works after upgrading to the most recent nightly toolchain available: v0.4 - Ubuntu 18.04 (CUDA 10).

After entering the code below into the swift repl:

var x = Tensor<Float>(shape: [1, 2, 2], scalars: [0, 1, 2, 3])
x[0] = Tensor<Float>(shape: [2, 2], scalars: [10, 20, 30, 40])

the following error occurs:

Fatal error: slice index 0 of dimension 0 out of bounds.: file /swift-base/tensorflow-swift-apis/Sources/TensorFlow/Bindings/EagerExecution.swift, line 299
Current stack trace:
0    libswiftCore.so                    0x00007ffff7c13860 swift_reportError + 50
1    libswiftCore.so                    0x00007ffff7c825d0 _swift_stdlib_reportFatalErrorInFile + 115
2    libswiftCore.so                    0x00007ffff7baaa7e <unavailable> + 3738238
3    libswiftCore.so                    0x00007ffff7baabf7 <unavailable> + 3738615
4    libswiftCore.so                    0x00007ffff7978bdd <unavailable> + 1436637
5    libswiftCore.so                    0x00007ffff7b7fa28 <unavailable> + 3562024
6    libswiftCore.so                    0x00007ffff7978039 <unavailable> + 1433657
7    libswiftTensorFlow.so              0x00007ffff4d893e0 <unavailable> + 2663392
8    libswiftTensorFlow.so              0x00007ffff4bee560 checkOk(_:file:line:) + 461
9    libswiftTensorFlow.so              0x00007ffff4bf5690 TFE_Op.evaluateUnsafe() + 506
10   libswiftTensorFlow.so              0x00007ffff4bf5f00 TFE_Op.execute<A>(_:) + 132
11   libswiftTensorFlow.so              0x00007ffff4bfeb94 <unavailable> + 1047444
Execution interrupted. Enter code to recover and continue.
Enter LLDB commands to investigate (type :help for assistance.)
Process 3540 stopped
* thread #1, name = 'repl_swift', stop reason = signal SIGILL: illegal instruction operand
    frame #0: 0x00007ffff7b7fa35 libswiftCore.so`function signature specialization <Arg[0] = Exploded, Arg[1] = Exploded> of Swift._assertionFailure(_: Swift.StaticString, _: Swift.String, file: Swift.StaticString, line: Swift.UInt, flags: Swift.UInt32) -> Swift.Never + 501
libswiftCore.so`function signature specialization <Arg[0] = Exploded, Arg[1] = Exploded> of Swift._assertionFailure(_: Swift.StaticString, _: Swift.String, file: Swift.StaticString, line: Swift.UInt, flags: Swift.UInt32) -> Swift.Never:
->  0x7ffff7b7fa35 <+501>: ud2    
    0x7ffff7b7fa37 <+503>: movl   %r15d, %ecx
    0x7ffff7b7fa3a <+506>: shrl   $0x6, %ecx
    0x7ffff7b7fa3d <+509>: movl   %r15d, %eax
Target 0: (repl_swift) stopped.
