Giter Club home page Giter Club logo

Comments (20)

lithiumtoast avatar lithiumtoast commented on August 23, 2024 1

Problem was:

typedef int PNSC_FNSPEECH_DATA (const unsigned char *pData, unsigned int cbDataSize, PNSC_SOUND_DATA pSoundData, void *pAppInstanceData);

This is the first time I have encountered a function proto inlined as a typedef in this fashion. Usually I encounter function protos in this fashion:

typedef SDL_HitTestResult (SDLCALL *SDL_HitTest)(SDL_Window *win, const SDL_Point *area, void *data);

Anyways just a small couple line fix in logic: d1c74f0

Note that the API in question looks to be quite intensive on strings. Please advise that casting CString8U to string calls Runtime.CString, to which that function hashes the string if the pointer representing the string (often the case of pointer to first element) is not in the cache. While djb2 is extremely fast, this is technically a O(n) where n is the length of the string; I have not found a more elegant way to deal with strings. Ideas and feedback is much appreciated.

from c2cs.

lithiumtoast avatar lithiumtoast commented on August 23, 2024 1

Why not use marshaling that comes with C#?

Last time I looked into it it it was implementation details: each C string (char*) when converted to a C# string (string) would be a new allocation on the GC heap which is not ideal for my use cases. I do it manually so I can have the opportunity to cache the strings using a hashing algorithm or do something else entirely in the future. It also ensures that the bindings are exactly 1-1 in terms of bit layout.

There is a whole GitHub issue on .NET about this problem: dotnet/corefxlab#2350

from c2cs.

lithiumtoast avatar lithiumtoast commented on August 23, 2024 1

Would you mind telling how you discovered what went wrong.

I just run the program with the debugger enabled and observe the stack trace. I then unravel the stack and observe the state of variables and other memory which cause the program to enter the particular state. The only magical pieces are libclang and Rosyln; the rest is 100% just ordinary C# code. However libclang can be a little hard to debug sometimes and I find myself entering throw away code to get the name of a type / cursor in some cases to help with debugging.

The tests are mostly done via smoke tests which I use the bindings generated by C2CS in my own libraries. I know immediately if something doesn't work right because I use the bindings all the time.

from c2cs.

lithiumtoast avatar lithiumtoast commented on August 23, 2024 1

Consider the case where a C library uses C strings (char*) for identifiers that are most likely less than 25 characters in length. E.g. "PlayerEntity", "WeaponComponent", "RenderingSystem", etc. These strings are used frequently and usually out of context to the global C functions as part of the API to do something significant such as getting an entity or updating a component, etc, and then returning a callback. I do not want the garbage collector in C# to allocate memory in heap 0 for these strings from the callback. Instead I rather run a quick hash over the C string and use a hashtable to know if that C string's contents has already has an analogous memory allocated in the GC and use that reference instead.

System.String.Intern(string str) would work if it accepted a raw pointer to a C style string. Otherwise, it accepts a string as input but that string instance still has to be allocated first which is not great. Also, according to the docs the memory would be retained until the application exits or restarts which is not great either.

from c2cs.

lithiumtoast avatar lithiumtoast commented on August 23, 2024 1

a special mode

Sure, it could be an option that starts at the cs program level:

from c2cs.

lithiumtoast avatar lithiumtoast commented on August 23, 2024

Do you have a header / repo I could use to reproduce?

from c2cs.

lithiumtoast avatar lithiumtoast commented on August 23, 2024

@roozbehid-ic But yes, there is support for wchar_t. It's likely you have just hit a corner case that wasn't accounted for.

from c2cs.

roozbehid-ic avatar roozbehid-ic commented on August 23, 2024

This is the header

#include <stddef.h>
#define WINAPI
#define NSCUBE_EXPORTS
#define NSCUBE_API __declspec(dllexport)


typedef unsigned long	nscCHANID;
typedef void *			nscHANDLE;
typedef void *			nscHSRV;
typedef int	nscRESULT;

NSCUBE_API nscRESULT WINAPI nscAddTextW
(
	nscHANDLE hTTS,
	const wchar_t *pwszInputTxt,
	void *pUserText
);

and how I run it and what I get

C2CS.exe ast -i nscube.h -o nscube.json
AbstractSyntaxTreeC: Started.
        Started step (1/3) 'Parse C code from disk'
libclang: Parsing 'nscube.h' with the following arguments...
        --language=c --std=c11 -Wno-pragma-once-outside-header -fno-blocks -m64 --include-directory= -isystemC:\Program Files (x86)\Windows Kits\10\Include\10.0.17763.0\ucrt -isystemC:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.29.30133\include
        Finished step (1/3) 'Parse C code from disk' in 278.1678 ms
        Started step (2/3) 'Extract C abstract syntax tree'
AbstractSyntaxTreeC: Finished unsuccessfully. Review the following diagnostics for reason(s) why:
PANIC: [Panic] String cannot be of zero length. (Parameter 'oldValue')
   at System.String.Replace(String oldValue, String newValue)
   at C2CS.UseCases.AbstractSyntaxTreeC.ClangTranslationUnitExplorer.Location(CXCursor cursor, Nullable`1 type)
   at C2CS.UseCases.AbstractSyntaxTreeC.ClangTranslationUnitExplorer.VisitTranslationUnit(CXTranslationUnit translationUnit)
   at C2CS.UseCases.AbstractSyntaxTreeC.ClangTranslationUnitExplorer.AbstractSyntaxTree(CXTranslationUnit translationUnit, Int32 bitness)
   at C2CS.UseCases.AbstractSyntaxTreeC.UseCase.Explore(CXTranslationUnit translationUnit, ImmutableArray`1 includeDirectories, ImmutableArray`1 ignoredFiles, ImmutableArray`1 opaqueTypes, ImmutableArray`1 whitelistFunctionNames, Int32 bitness)
   at C2CS.UseCases.AbstractSyntaxTreeC.UseCase.Execute(Request request, Response response)
   at C2CS.UseCase`2.Execute(TRequest request)

from c2cs.

lithiumtoast avatar lithiumtoast commented on August 23, 2024

Issue was the directory was not specified for input and output. I always used it like this before:

C2CS.exe ast -i .\nscube.h -o .\nscube.json

I fixed it where if the directory is not specified then the current directory would be used.

Let me know if you have any other issues or things are still persisting.

from c2cs.

roozbehid-ic avatar roozbehid-ic commented on August 23, 2024

Thanks I got past that one but getting a new error.

c:\Utils\C2CS.exe  ast -i .\nscube.h -o .\nscube.json
AbstractSyntaxTreeC: Started.
        Started step (1/3) 'Parse C code from disk'
libclang: Parsing '.\nscube.h' with the following arguments...
        --language=c --std=c11 -Wno-pragma-once-outside-header -fno-blocks -m64 --include-directory=. -isystemC:\Program Files (x86)\Windows Kits\10\Include\10.0.17763.0\ucrt -isystemC:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.29.30133\include
        Finished step (1/3) 'Parse C code from disk' in 236.3814 ms
        Started step (2/3) 'Extract C abstract syntax tree'
AbstractSyntaxTreeC: Finished unsuccessfully. Review the following diagnostics for reason(s) why:
PANIC: [Panic] Type name was not found for 'FunctionPointer'.
   at C2CS.UseCases.AbstractSyntaxTreeC.ClangTranslationUnitExplorer.TypeName(String parentTypeName, CKind kind, CXType type, CXCursor cursor)
   at C2CS.UseCases.AbstractSyntaxTreeC.ClangTranslationUnitExplorer.CreateFunctionPointer(String typeName, CXCursor cursor, Node parentNode, CXType originalType, CXType type, ClangLocation location)
   at C2CS.UseCases.AbstractSyntaxTreeC.ClangTranslationUnitExplorer.ExploreFunctionPointer(String typeName, CXCursor cursor, CXType type, CXType originalType, ClangLocation location, Node parentNode)
   at C2CS.UseCases.AbstractSyntaxTreeC.ClangTranslationUnitExplorer.ExploreTypedef(Node node, Node parentNode)
   at C2CS.UseCases.AbstractSyntaxTreeC.ClangTranslationUnitExplorer.ExploreNode(Node node)
   at C2CS.UseCases.AbstractSyntaxTreeC.ClangTranslationUnitExplorer.Explore()
   at C2CS.UseCases.AbstractSyntaxTreeC.ClangTranslationUnitExplorer.AbstractSyntaxTree(CXTranslationUnit translationUnit, Int32 bitness)
   at C2CS.UseCases.AbstractSyntaxTreeC.UseCase.Explore(CXTranslationUnit translationUnit, ImmutableArray`1 includeDirectories, ImmutableArray`1 ignoredFiles, ImmutableArray`1 opaqueTypes, ImmutableArray`1 whitelistFunctionNames, Int32 bitness)
   at C2CS.UseCases.AbstractSyntaxTreeC.UseCase.Execute(Request request, Response response)
   at C2CS.UseCase`2.Execute(TRequest request)

You can see the header file here.
https://gist.github.com/roozbehid-ic/03197d9db0735454ef893132c5b30dd6

from c2cs.

roozbehid-ic avatar roozbehid-ic commented on August 23, 2024

Thank you so much for all the help.
Two things:

Would you mind telling how you discovered what went wrong. At least it helps future people to know how to diagnose the problems.
Also it seems you are using special classes for unicode, etc. Why not use marshaling that comes with C#? It seems you are targeting .net 5 then it should also have utf8, utf16 and all those marshaling and looks more native to c#.

from c2cs.

roozbehid-ic avatar roozbehid-ic commented on August 23, 2024

Makes sense.
I use following for char* which is potentially utf8 to string (although C# comes with ascii char* to string marshaller)

    public class UTF8Marshaler : ICustomMarshaler
    {
        static UTF8Marshaler static_instance;

        public IntPtr MarshalManagedToNative(object managedObj)
        {
            if (managedObj == null)
                return IntPtr.Zero;
            if (!(managedObj is string))
                throw new MarshalDirectiveException(
                       "UTF8Marshaler must be used on a string.");

            // not null terminated
            byte[] strbuf = Encoding.UTF8.GetBytes((string)managedObj);
            IntPtr buffer = Marshal.AllocHGlobal(strbuf.Length + 1);
            Marshal.Copy(strbuf, 0, buffer, strbuf.Length);

            // write the terminating null
            Marshal.WriteByte(buffer + strbuf.Length, 0);
            return buffer;
        }

        public unsafe object MarshalNativeToManaged(IntPtr pNativeData)
        {
            byte* walk = (byte*)pNativeData;

            // find the end of the string
            while (*walk != 0)
            {
                walk++;
            }
            int length = (int)(walk - (byte*)pNativeData);

            // should not be null terminated
            byte[] strbuf = new byte[length];
            // skip the trailing null
            Marshal.Copy((IntPtr)pNativeData, strbuf, 0, length);
            string data = Encoding.UTF8.GetString(strbuf);
            return data;
        }

        public void CleanUpNativeData(IntPtr pNativeData)
        {
            Marshal.FreeHGlobal(pNativeData);
        }

        public void CleanUpManagedData(object managedObj)
        {
        }

        public int GetNativeDataSize()
        {
            return -1;
        }

        public static ICustomMarshaler GetInstance(string cookie)
        {
            if (static_instance == null)
            {
                return static_instance = new UTF8Marshaler();
            }
            return static_instance;
        }
    }

Which probably ends up being similar to what you do.
But on delegates and dllimport it looks more natural and you end up using string class.

        public delegate int ActionReply(Int64 Id, [MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(UTF8Marshaler))] string nameValueParameters);

or

        [DllImport("mycode.dll",  EntryPoint = "channelIssueActionRequest")]
        public static extern int ChannelIssueActionRequest(Int64 Id,int requestId, [MarshalAs(UnmanagedType.LPStr)] string actionTypeName, [MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef=typeof(UTF8Marshaler))] string nameValueParameters);

in This one actionTypeName is from ascii char* to string, and nameValueParameters is from char* utf8 to string
So technically in this approach you always have string in C# which might not be desirable for many cases as you lose original data. But also totally makes sense for many other cases.

from c2cs.

roozbehid-ic avatar roozbehid-ic commented on August 23, 2024

I am wondering why are you caching strings?
Could this be used instead https://docs.microsoft.com/en-us/dotnet/api/system.string.intern?view=net-5.0 ?

from c2cs.

lithiumtoast avatar lithiumtoast commented on August 23, 2024

I use following for char* which is potentially utf8 to string (although C# comes with ascii char* to string marshaller)

This is interesting, I'll have to try it out sometime and see if it makes more sense. Right now implicit casting makes CString8U and CString16U "behave" and be "accepted" into functions and methods which use string. I do the same technique for CBool and bool.

from c2cs.

roozbehid-ic avatar roozbehid-ic commented on August 23, 2024

Also FYI, I pasted the code to show custom marshallers could potentially be used (although I don't think they can be used in unmanaged function pointers)
New C# already has [MarshalAs(UnmanagedType.LPUTF8Str)] which can convert char* as utf8 to string

What do you think about a special mode of c2cs program that would generate managed C# code and also uses these marshaling?

from c2cs.

roozbehid-ic avatar roozbehid-ic commented on August 23, 2024

I can give it a try and see if I can send a PR for you. Thanks

from c2cs.

lithiumtoast avatar lithiumtoast commented on August 23, 2024

Just a note of warning, I have StyleCop ruleset turned on for this project. You can disable it by adding the following to the .csproj in question:

<EnableAnalyzers>false</EnableAnalyzers>

from c2cs.

lithiumtoast avatar lithiumtoast commented on August 23, 2024

Closing as the original problem was resolved

from c2cs.

lithiumtoast avatar lithiumtoast commented on August 23, 2024

@roozbehid-ic I have just encountered a grave mistake on my part. wchar_t is 2 bytes on Windows but actually 4 bytes on Linux. I'll need to re-think this.

from c2cs.

lithiumtoast avatar lithiumtoast commented on August 23, 2024

I'm going to close this actually a create a new issue and link the related issues.

from c2cs.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.