
Comments (22)

bilderbuchi commented on August 17, 2024

+1, for what it's worth.
There were a couple threads about this in the support forum, but these seem to be gone now, only replaced by a contact form. As far as I can recall, the "conclusion" was that since you can't disambiguate by extension, it's too complicated and won't be done.

This is why the README of linguist gives me new hope. It says that they already use "deep content inspection" to correctly identify (extensionless) script files, and use it to correctly assign .h files to their respective languages, so I think it can be concluded that correctly assigning .m files is possible now.

The thing is, I don't know what/how to contribute to Linguist here - there's already a lexer entry for Matlab/origin (even with .m file extension), and it's listed in languages.yml (Although only with .matlab extension). I think in one of the threads in the support forum, there were already a couple of unique syntax characteristics for Matlab and Objective-C identified, but I have no way to check/dig that out (see above).

edit: An example repository is this one (mostly Matlab/Octave, and a bit of C++, no Obj-C): https://github.com/bilderbuchi/OpenTLD
Also, for what it's worth, I have never seen anyone use the .matlab extension. Even official code by The Mathworks (e.g. built-in functions) have a .m extension!

from linguist.

josh commented on August 17, 2024

👍 We'd need to come up with a good heuristic to detect Matlab. Ideas?

bilderbuchi commented on August 17, 2024

I'm working on something right now. Will post a gist later.
Do you have the permissions to look in the old support forum? ("Matlab language" should be a sufficient search term) Or are those totally nuked?

josh commented on August 17, 2024

@bilderbuchi I did a quick check. Only found reports that .m wasn't working for Matlab. Nothing useful.

bilderbuchi commented on August 17, 2024

OK, thanks.

So, I adapted from the .h recognition in https://github.com/github/linguist/blob/master/lib/linguist/blob_helper.rb#L276-289
Only in pseudocode unfortunately, I don't speak ruby (or Obj-C for that matter):
https://gist.github.com/1051201

Major points:

If Obj-C can be identified confidently, Matlab could be a fall-through/else option without needing Matlab heuristics.

Maybe, there's an already finished and working heuristic at http://cloc.sourceforge.net/
Update: The relevant code is in function matlab_or_objective_C in http://cloc.svn.sourceforge.net/viewvc/cloc/trunk/cloc?revision=234&view=markup L6352 ff.

Btw, it will be impossible to distinguish between octave and Matlab code, so I think they should consistently be lumped together ("Matlab/Octave")...
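
A minimal sketch of the fall-through idea above, in Python rather than Ruby. The marker list and the name `classify_m_file` are assumptions for illustration; they are not taken from linguist or from cloc's `matlab_or_objective_C`.

```python
import re

# Hypothetical marker list: directives that typically start a line in
# Objective-C and are unlikely to start a line in Matlab/Octave code.
OBJC_MARKERS = re.compile(
    r"^\s*(#import\b|@interface\b|@implementation\b|@protocol\b|@property\b|@end\b)",
    re.MULTILINE,
)


def classify_m_file(source: str) -> str:
    """Return 'Objective-C' on a marker match, else 'Matlab/Octave'."""
    if OBJC_MARKERS.search(source):
        return "Objective-C"
    # Fall-through: no Matlab-specific heuristic is needed here.
    return "Matlab/Octave"
```

The trade-off is that any `.m` file without one of these markers is counted as Matlab/Octave, so the quality of the result depends entirely on how reliable the Obj-C check is.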

jovo commented on August 17, 2024

matlab code files are all either "functions" or "scripts".
functions must (i think) start like this:

    function varout = fname(varin1,varin2)

or

    function [varout1 varout2 ....] = fname(varin1,varin2,...)

they are permitted to have blank space or comments before this line,
so this won't work for all matlab code files, but it should work for a chunk of them.

moreover, the last word of a function should be end but that is not enforced.
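
The check described here could be sketched roughly as follows, in Python. The helper name `starts_with_function` and the regular expression are illustrative assumptions, not linguist code. It skips leading blank lines and `%` comments, then tests whether the first remaining line opens with the `function` keyword.

```python
import re

# Matches any line whose first token is the keyword `function`.
FUNCTION_LINE = re.compile(r"^\s*function\b")


def starts_with_function(source: str) -> bool:
    """True if the first line that is neither blank nor a % comment opens a function."""
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("%"):
            continue  # skip leading blank lines and comments
        return bool(FUNCTION_LINE.match(line))
    return False
```

Script files return `False` here, which matches the caveat that this only catches a chunk of Matlab files.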

bilderbuchi commented on August 17, 2024

yeah, I know, and that's what I had already written in the gist. Or am I missing something here?

josh commented on August 17, 2024

Unfortunately the gist isn't an applyable patch with test cases.

bilderbuchi commented on August 17, 2024

Yeah, I know. :-(
That's why we'd need someone who speaks ruby, obj-c and matlab. Optimally perl, to correctly translate the heuristic of cloc I linked to.

josh commented on August 17, 2024

If you guys could put together some solid test fixtures for both matlab and obj-c I could work on putting the implementation together.

bilderbuchi commented on August 17, 2024

Done. See #30

josh commented on August 17, 2024

Basic support is in.

If you find any files that don't match, please send a pull request with a failing test case.

bilderbuchi commented on August 17, 2024

nice, thank you!
when will this go live/affect existing repository statistics?

audioplastic commented on August 17, 2024

Hi all. This works well for one of my repositories although it is still saying that there is a little objective-C in there. There is no objective-C in either example.

https://github.com/audioplastic/MAP/graphs/languages

... but it falls flat on its face for object-oriented Matlab code.

https://github.com/audioplastic/soma/graphs/languages

bilderbuchi commented on August 17, 2024

Hm, I just checked in my repo (https://github.com/bilderbuchi/OpenTLD/graphs/languages), and Matlab doesn't get recognized at all. I thought that maybe the graphs take some time to refresh, but that should have already happened by now I guess?

Anyway, possibly the Matlab comment recognition (5ecc442#L0R337) should be moved above the Obj-C to collect most Matlab files by the comments?

Also, I find it curious how it puts Obj-C vs Matlab at 91 vs 9% in your failing repo - this seems to imply one of eleven .m files is Matlab, but there are only 6 .m files in the whole repo (and only 10 non-image files), so I wonder where those numbers come from.

earl commented on August 17, 2024

@bilderbuchi

> In my OpenTLD repo [..] Matlab doesn't get recognized at all. I thought that maybe the graphs take some time to refresh [..]

Those statistics indeed take time to refresh. They'll get updated if you push some changes, though. So you'll either have to wait for it to be reindexed, or push some commits.

For reference, here's what the current master of linguist thinks about your OpenTLD repo:

71%  Matlab
27%  C++
2%   C
1%   Objective-C

> I wonder where those numbers come from.

They are based on lines of code, not on the number of files.

audioplastic commented on August 17, 2024

We could probably solve the object oriented matlab problem by just adding a filter that looks for "classdef" on line 1. Right-clicking in the current folder window in matlab and adding a new file of type class gives you a file like the following

    classdef dsfssdf
        %DSFSSDF Summary of this class goes here
        %   Detailed explanation goes here

        properties
        end

        methods
        end

    end
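
A rough Python sketch of such a filter, for illustration only. The helper name `starts_with_classdef` is an assumption, and it checks the first non-blank line rather than strictly line 1.

```python
import re

# First-token check for the `classdef` keyword.
CLASSDEF_LINE = re.compile(r"^\s*classdef\b")


def starts_with_classdef(source: str) -> bool:
    """True if the first non-blank line begins with the classdef keyword."""
    for line in source.splitlines():
        if line.strip():
            return bool(CLASSDEF_LINE.match(line))
    return False
```

A file that has comments before `classdef` would be missed by this version; skipping leading `%` lines first would cover that case.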

@bilderbuchi
With regard to your comment about my repo, I'm guessing that the percentages are calculated based on the number of lines or characters rather than the number of files?

@earl
I have just done a push to test my failing repo with the latest linguist and I get the same numbers.

bilderbuchi commented on August 17, 2024

yes, lines of code would be obvious. sorry, i'm stupid. i should get more coffee :-P

@earl: these figures for my repo look right, thanks for checking. the 1% obj-C is a misclassification afaik, but 1% error is more than OK!

@audioplastic: sounds good, could be a one-line change here if we lump class recognition together with function recognition.

audioplastic commented on August 17, 2024

@bilderbuchi
I'll have a shot at making the necessary modifications tonight and will open a pull request unless anyone beats me to it.

josh commented on August 17, 2024

More matlab improvements are welcome. Just be sure to improve the tests to match whatever new heuristics you are adding.

The graphs are weighted by file size instead of LOC. It's more convenient since that data is already cached. And sorry, no, we can't switch it because I'd need to reindex all the code on GitHub :)

https://github.com/github/linguist/blob/master/lib/linguist/repository.rb#L75
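
To illustrate what weighting by file size means for the percentages, here is a small Python sketch. It is not the code in `repository.rb`; the function name `language_shares` and the byte counts are made up.

```python
from collections import defaultdict


def language_shares(files):
    """Map language -> share of total bytes, given (language, size_in_bytes) pairs."""
    totals = defaultdict(int)
    for language, size in files:
        totals[language] += size
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {lang: size / grand_total for lang, size in totals.items()}


# Made-up sizes: one large file outweighs five small ones.
files = [("Objective-C", 9100)] + [("Matlab", 180)] * 5
shares = language_shares(files)
```

With these invented numbers, a single large file accounts for most of the total even though it is only one of six files, so the split does not have to follow the file count.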

Air-Craft commented on August 17, 2024

I've just tried to create an Objective-C gist and when it saves it reverts to Matlab. Editing has no effect.

whitten commented on August 17, 2024

If your code is Objective-C, then you need to find out how to get the linguist
program to recognize that you did NOT write Matlab code. Make an example of it
and put it into the
https://github.com/github/linguist/tree/master/samples/Objective-C directory.
There are currently only 9 examples of Objective-C to contrast
with Matlab. Not much to tell them apart.

On Fri, Apr 11, 2014 at 7:30 AM, Hari Karam Singh wrote:

> I've just tried to create an Objective-C gist and when it saves it reverts
> to Matlab. Editing has no effect.

