prajna's Issues

Set up a cluster in Azure

Looking at the examples, it looks like one can pass in a PrajnaClusterFile.

However, I cannot find a sample Prajna cluster file.

Also, it is not clear how to set up a cluster on a cloud provider like Azure. Any guidance would be appreciated.

Using DSet<bool> causes an exception at Prajna.Core.MetaFunction`1.EncodeFunc

For code like

    let numPartitions = 4
    let guid = Guid.NewGuid().ToString("D")
    // Every partition yields ten booleans: true for even i, false for odd i.
    let d = DSet<_> ( Name = guid, Cluster = cluster)
           |> DSet.sourceI numPartitions (fun _ -> seq { for i = 0 to 9 do yield i % 2 = 0 })

    let r = d.ToSeq() |> Array.ofSeq

When this is run on a local cluster, the serializer throws an exception.

The exception is

  System.ArgumentException: Offset and length were out of bounds for the array or count is greater than the number of elements from index to the end of the source collection

The stack is

   at System.Buffer.BlockCopy(Array src, Int32 srcOffset, Array dst, Int32 dstOffset, Int32 count)
   at Prajna.Tools.BufferListStream`1.SrcDstBlkCopy[T1,T2,T1Elem,T2Elem](T1 src, Int32& srcOffset, Int32& srcLen, T2 dst, Int32& dstOffset, Int32& dstLen) in C:\GitHub\Prajna\src\tools\tools\bufferliststream.fs:line 994
   at Prajna.Tools.BufferListStream`1.WriteArrT(Array buf, Int32 offset, Int32 count) in C:\GitHub\Prajna\src\tools\tools\bufferliststream.fs:line 1353
   at Prajna.Tools.Serializer.writeMemoryBlittableArray(Type elType, Array arrObj, MemoryStream memStream) in C:\GitHub\Prajna\src\tools\tools\serialize.fs:line 348
   at <StartupCode$Prajna-Tools>[email protected](Tuple`4 tupledArg) in C:\GitHub\Prajna\src\tools\tools\serialize.fs:line 379
   at Prajna.Tools.Serializer.WriteArray(Type arrayType, Array arrObj) in C:\GitHub\Prajna\src\tools\tools\serialize.fs:line 410
   at Prajna.Tools.BinarySerializer.System-Runtime-Serialization-IFormatter-Serialize(Stream stream, Object graph) in C:\GitHub\Prajna\src\tools\tools\serialize.fs:line 745
   at Prajna.Core.MetaFunction`1.EncodeFunc(BlobMetadata meta, U[] elemArray) in C:\GitHub\Prajna\src\CoreLib\function.fs:line 275
   at Prajna.Core.MetaFunction`1.EncodeFuncFromObj(BlobMetadata meta, Object o) in C:\GitHub\Prajna\src\CoreLib\function.fs:line 253
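
One way to get this exact ArgumentException from Buffer.BlockCopy is a byte count computed from a larger element size than the array really uses. The sketch below is not Prajna code and is not a confirmed diagnosis of this issue. It only shows that a bool[] stores 1 byte per element while Marshal.SizeOf reports 4 bytes for bool, so a count based on the latter overruns the source array. The class and method names are made up for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper for illustration; not part of Prajna.
static class BoolBlockCopy
{
    // Bytes per element as stored in a managed bool[] (1).
    public static int ManagedElementSize => sizeof(bool);

    // Bytes per element when bool is marshaled to native code (4).
    public static int MarshaledElementSize => Marshal.SizeOf<bool>();

    // Tries to copy `length` bools assuming `bytesPerElement` bytes each.
    // Returns false when Buffer.BlockCopy rejects the byte count.
    public static bool TryCopy(int length, int bytesPerElement)
    {
        var src = new bool[length];
        var dst = new byte[length * bytesPerElement];
        try
        {
            Buffer.BlockCopy(src, 0, dst, 0, length * bytesPerElement);
            return true;
        }
        catch (ArgumentException)
        {
            return false;
        }
    }
}
```

Calling BoolBlockCopy.TryCopy(10, BoolBlockCopy.ManagedElementSize) succeeds, while passing BoolBlockCopy.MarshaledElementSize asks for 40 bytes from a 10-byte array and fails. Buffer.ByteLength(array) is a safe way to get the real byte count of a primitive array.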

Support concurrent read from remote DSet

The current implementation does not support reading a DSet concurrently from two different threads. Effectively, one cannot have two enumerators over the same DSet at the same time.
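
Until concurrent enumeration is supported, one possible workaround is to enumerate the data once and let the threads share the in-memory copy. This is a minimal sketch that assumes the results fit in memory on the calling machine. The helper name is hypothetical, and a plain IEnumerable<int> stands in for whatever a single call such as dset.ToIEnumerable() would return.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of Prajna.
static class SharedSnapshot
{
    // Enumerates `source` exactly once, then lets `readers` tasks
    // read the resulting array in parallel. Returns one sum per reader.
    public static int[] SumConcurrently(IEnumerable<int> source, int readers)
    {
        int[] snapshot = source.ToArray();   // the only enumeration of the source

        Task<int>[] tasks = Enumerable.Range(0, readers)
            .Select(_ => Task.Run(() => snapshot.Sum()))
            .ToArray();

        Task.WaitAll(tasks);
        return tasks.Select(t => t.Result).ToArray();
    }
}
```

Reading an array that no one modifies is safe from multiple threads, so the readers need no locking.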

executable not generated

Hi, I built Prajna from the master branch with Visual Studio and started the client without options. I then created an app, but it does not generate any output. The log says the process failed to start, so I located the source and added a line to print the executable name:

    let StartChild(startInfo : ProcessStartInfo) =
        if (CommonJob <> IntPtr.Zero) then
            printfn "---> %A" startInfo.FileName
            let proc = Process.Start(startInfo)
            let res = AssignProcessToJobObject(CommonJob, proc.Handle)
            if (not res) then

And now it prints out the path:

C:\Users\solom\Documents\Projects\Prajna\src\Client\Client\bin\Releasex64>PrajnaClient.exe
---> "C:\Prajna\Job1082\PrajnaTest.CS\A17E5B37B00DA042\PrajnaClientExt_PrajnaTest.CS.exe.exe"

Then I looked at that path, and the file really does not exist. The folder contains only one config file and many referenced DLLs. Any idea?

app.config is lost on the remote machine

I tried to use app.config to configure my application. The file contains a configSection. This does not work: when I checked the file in C:\Prajna\Job1082\MyApp\XXXX, the app.config was a completely new one that only adds some FSharp.Core binding. Is this a bug, and should it be fixed?
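
A possible workaround, sketched below, is to read the settings on the machine that launches the job and capture the parsed values in the function that runs remotely, instead of reading the config file on the remote side. This assumes that values captured by a closure travel with it to the remote node, which this page does not confirm. The helper name and the "GpuId" key are illustrative, and a dictionary stands in for the real configuration source.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of Prajna.
static class ConfigCapture
{
    // Reads the setting once, on the launching machine, and returns
    // a function that carries the parsed value in its closure.
    public static Func<int, string> MakeDescribe(IDictionary<string, string> settings)
    {
        int gpuId = int.Parse(settings["GpuId"]);
        return taskId => $"taskId={taskId} gpuId={gpuId}";
    }
}
```

The returned function no longer touches any configuration file, so it does not depend on which app.config is present where it runs.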

Better error msg for authentication failure

When clients are deployed with a password and the app sends a job request to them without supplying the password, or with the wrong password, the request fails, but the exception message is something like "Job fails as some source is not available". A better error message that accurately describes the reason is needed.

Default Cache Memory Limit

This issue was brought to my attention by Bruno.

One of the workloads he has written does not cache the data as desired. Investigation showed that the default memory size of the container is set to 1024 MB, so when the data set becomes large it is not cached, which degrades performance.

I would like to open this issue to document the behavior. The questions are:

  1. Should we raise default memory size?
  2. Should we give a warning when the cache is not working? If so, how: throw an exception, print a warning, etc.?

daemon doesn't work in remote machine

Hi,

I did the following:

  1. I built Prajna with build.cmd R from the master branch source code;
  2. I copied the client folder to two machines
  3. I deleted the folder C:\Prajna on both machines
  4. I turned off the Windows Firewall completely on both machines
  5. I started the client without any options (so it uses the default port 1082)

Then I simply want to call this remotely:

        private static void SayHello(Cluster cluster)
        {
            var dset = new DSet<int> { Name = Guid.NewGuid().ToString("D"), Cluster = cluster };
            var descriptions =
                dset
                .Distribute(Enumerable.Range(0, cluster.NumNodes))
                .Select(i =>
                {
                    var gpuId = Int32.Parse(ConfigurationManager.AppSettings["GpuId"]);
                    var machineName = System.Environment.MachineName;
                    var process = System.Diagnostics.Process.GetCurrentProcess();
                    var gpu = Gpu.Get(gpuId);
                    return $"Hello from {machineName} {gpu} taskId={i} processId={process.Id} threadId={Thread.CurrentThread.ManagedThreadId}";
                })
                .ToIEnumerable()
                .ToArray();
            foreach (var description in descriptions)
            {
                Console.WriteLine(description);
            }
        }

The test result is like this:

If I use the following cluster.lst, then it WORKS:

XiangCluster,1082
localhost,1082

Also, if I use real IP, it also works (I launch the application from the same machine):

XiangCluster,1082
192.168.1.110,1082

Then if I want to add a remote machine, like:

XiangCluster,1082
192.168.1.110,1082
192.168.1.108,1082

Then it DOESN'T WORK anymore.

I checked the daemon log on 192.168.1.110 and found something like:

============== New Log File ======================= 
160222_020627.133310,1,Info,PrajnaMachineId is 290efbd143477d11
160222_020627.173490,1,Info,Initialize network stack with initial buffers: 128 max buffers: 33554 buffer size: 128000 network threads: 2
160222_020627.215722,1,Info,Start PrajnaClient at port 1082 (1100-1150)...................... Mode x64, 1 MB
160222_020627.218012,1,Info,Minimum threads: 16, Minimum I/O completion threads: 4
160222_020627.218622,1,Info,Maximum threads: 32767, Maximum I/O completion threads: 1000
160222_020627.219319,1,Info,Available threads: 32767, Available I/O completion threads: 1000
160222_020627.220786,1,Info,Start Parameters [||]
160222_020627.228628,1,Info,All command parsed ==== true
160222_020627.261606,1,Info,Authentication parameters: pwd=empty keyfile= keyfilepwd=empty
160222_020709.983452,18,Info,GetDriveSpace, fail to retrieve remote storage information for machine 192.168.1.108, with exception System.Management.ManagementException: Access denied 
   at System.Management.ManagementException.ThrowWithExtendedInfo(ManagementStatus errorCode)
   at System.Management.ManagementScope.InitializeGuts(Object o)
   at System.Management.ManagementScope.Initialize()
   at System.Management.ManagementObjectSearcher.Initialize()
   at System.Management.ManagementObjectSearcher.Get()
   at Prajna.Core.RemoteConfig.GetDriveSpace(String machineName)
160222_020744.693316,16,Error,Prajna.Core.Task.ErrorInSeparateApp : (Close,Job) Failed to find Job Action object for Job a6dfc439-1db5-41f5-9843-569a50737867, error has happened before? 

BTW, when I use the Prajna from the NuGet package, it works.

Enormous memory usage in the word-count example

Hi!
I just tried to run your example from https://github.com/MSRCCS/Prajna/wiki/C%23-Examples#the-first-prajna-c-example-walk-through

My text file is about 10 MB. I did not expect to see about 2 GB of memory used to count the words in this file. There may be some overhead, but can you tell me how I can tune memory usage?
I can see from one of the issues that you use a 1 GB memory cache per node, but I cannot see any way to configure the cache size, or to use my own cache (for example, one outside application memory).
One more thing: memory usage does not scale with the node count (2 GB for 2 or 3 nodes, 3 GB for 4 nodes). It seems that a node allocates memory only when it uses it, but this is not clear to me. Could you please explain?

Question about the order in DSet sequence

Hi, I am testing Prajna today, and it is very cool. But I ran into some strange behavior, and I don't know whether it is by design and I misunderstand it, or whether it is a bug.

First, I have two machines, both running PrajnaClient, so my cluster is composed of two nodes.

I run the test from a machine called "KINGKONG", which I expect to be faster because it is local. So, in the transform, I do some heavy work (such as writing a lot of trace log entries) when the node is "KINGKONG", to slow it down. After that, when I call DSet.ToIEnumerable, I get the results in reversed order.

The test code is like:

        static void SequenceOrder(Cluster cluster)
        {
            var dset = new DSet<int> {Name = Guid.NewGuid().ToString("D"), Cluster = cluster};
            var inputs = Enumerable.Range(0, cluster.NumNodes).ToArray();
            var outputs =
                dset.Distribute(inputs)
                    .Select(x =>
                    {
                        var machineName = System.Environment.MachineName;
                        if (machineName == "KINGKONG")
                        {
                            for (var i = 0; i < 10000; ++i)
                            {
                                Trace.TraceInformation("{0}...", i);
                            }
                        }
                        return x;
                    }).ToIEnumerable().ToArray();

            Console.WriteLine("Partitions: {0}", dset.NumPartitions);

            for (var i = 0; i < inputs.Length; ++i)
            {
                Console.WriteLine("#.{0}: {1} {2}", i, inputs[i], outputs[i]);
            }
        }

And the output looks like:

Partitions: 2
#.0: 0 1
#.1: 1 0
Press any key to continue . . .
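
Whether the output order is meant to follow the input order is the open question here. If the caller needs input order regardless, one workaround is to send each value together with its position and sort by position after collecting. The sketch below uses plain LINQ and hypothetical helper names. In the test code above, the tagged array would be passed to Distribute and the collected array to Restore, assuming KeyValuePair values can be shipped between nodes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of Prajna.
static class OrderRestore
{
    // Pairs each input with its position so the position travels with the value.
    public static KeyValuePair<int, T>[] Tag<T>(IEnumerable<T> inputs) =>
        inputs.Select((x, i) => new KeyValuePair<int, T>(i, x)).ToArray();

    // Sorts collected results back into input order and drops the tags.
    public static T[] Restore<T>(IEnumerable<KeyValuePair<int, T>> results) =>
        results.OrderBy(p => p.Key).Select(p => p.Value).ToArray();
}
```

The transform then has to pass the key through unchanged and only modify the value.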

"Job fails as some source is not available" when invoking on cluster

I've already seen #60, but in this case no passwords are used. I'm trying to invoke one of the examples (Pi) on a cluster comprising only my local machine. When I run the example with new Cluster("local[2]") it all works fine, but as soon as I use a cluster.lst I get the "Job fails as some source is not available" error.
My cluster.lst has:
Test,7077
localhost
And the client is invoked with:
-port 7077 -jobports 8888-8890

Any idea what might be causing this?
