TommasoBelluzzo commented on August 25, 2024

Dear @DF-18, after carefully looking at your dataset, I strongly suggest reviewing your input file and making it as compliant as possible with my examples. A few things I noticed:

  • As you mentioned, empty cells are not handled; replace all of them with zeros.
  • Only defaulted firms (all zeros at the end of any time series) and insolvent firms (all negative values at the end of the equity time series) are handled. I never found a good approach for handling companies that are unlisted or nonexistent at the beginning of the panel (for the moment, all zeros can do the trick, or you can repeat the first value back to the beginning of the panel). Any suggestion is more than welcome.
  • The first column of sheets must be called "Date", not "statadate" or "timeqt".
  • Your shares/capitalizations dates seem to follow the format "04-Jan-02", but you are calling the "parse_dataset" function with a completely different format: "dd/mm/yyyy". The same goes for the balance data sheets, which seem to be in the format "2002q1" while you are calling "parse_dataset" with "QQ yyyy".
  • The original language locale of the sheet seems to be Chinese, which is a broadly recognized source of problems in this project.
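The date mismatch in the list above can be illustrated with a minimal sketch. It assumes the dates have already been read into cell arrays of character vectors; the variable names are hypothetical, and only standard MATLAB functions (datenum, datestr, regexprep) are used, not anything from the package itself.

```matlab
% Sample strings copied from the thread; all variable names are illustrative.
daily_raw = {'04-Jan-02'; '07-Jan-02'};
quarterly_raw = {'2002q1'; '2002q2'};

% Daily dates: parse with the format the strings actually use ('dd-mmm-yy'),
% then re-emit them in the format that is passed to the parser ('dd/mm/yyyy').
daily_num = datenum(daily_raw,'dd-mmm-yy');
daily_fixed = cellstr(datestr(daily_num,'dd/mm/yyyy'));

% Quarterly labels: reorder '2002q1' into 'Q1 2002' so that it matches 'QQ yyyy'.
quarterly_fixed = regexprep(quarterly_raw,'^(\d{4})[qQ]([1-4])$','Q$2 $1');

disp(daily_fixed);
disp(quarterly_fixed);
```

The other option is to leave the sheet as it is and pass a format that matches it; which formats "parse_dataset" accepts is documented in the readme rather than in this thread.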

This package isn't just about dropping a random dataset into a folder, pushing F5 and waiting for MATLAB to display the results after 12h of computation. A minimum effort is required in order to provide clean, consistent and properly formatted data. Nothing is undocumented, at least from the point of view of inputs and outputs... and this is a good starting point. If that is not enough, at the bottom of the readme file, on the main page, there are a bunch of guidelines to make the dataset parsing process work.

Unfortunately, I must carry on with my life and my work, and the number of support requests has dramatically increased over the past years. As stated in the readme, I cannot provide direct support for this kind of issue anymore, but I hope my tips can help you out.

from systemicrisk.

DF-18 commented on August 25, 2024

It's really kind of you to offer those suggestions. I'm glad that I can discuss this with you.

After fixing the format issues like "Date", "yyyy/mm/dd" and "QQ yyyy", I tried to check my dataset for other potential errors.

First, I replaced some cells in the second column of the "Shares" sheet of "Example_Large.xlsx" with other closing prices of the market index, and the error reported was still "The 'Shares' sheet contains invalid column types.".

Then, I deleted "Example_Large.mat" to test the "parse_dataset" function, since the mat file is the output of the "parse_dataset" function.

Now something interesting happened: the error reported was still the same!

...
The 'Shares' sheet contains invalid column types.
...

The language of my Win10 system and Excel 2016 is English. Do you know why this happened? Thanks a lot.

TommasoBelluzzo commented on August 25, 2024

You didn't specify a very important thing: what is your MATLAB version?
Anyway, it seems like you are using a pre-9.1 version. Start debugging the function "ensure_field_consistency" in "parse_dataset" to see what's going on.
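A quick way to answer the version question is sketched below. It relies only on the standard functions verLessThan and version; the 9.1 threshold is the one mentioned above, and nothing here comes from the package itself.

```matlab
% verLessThan compares the running MATLAB release against a version number.
% Version 9.1 corresponds to release R2016b.
is_old = verLessThan('matlab','9.1');

if is_old
    fprintf('MATLAB %s is older than 9.1 (R2016b).\n',version());
else
    fprintf('MATLAB %s is 9.1 (R2016b) or newer.\n',version());
end
```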

DF-18 commented on August 25, 2024

I've installed MATLAB Version: 9.1.0.441655 (R2016b) and tried run.m again. Now the "parse_dataset" function works fine for "Example_Large.xlsx" after deleting "Example_Large.mat".

When I set 'CrossSectional' ENABLED true, ANALYZE true, and COMPARE true, the following errors were reported:

Error using cellfun
Input #2 expected to be a cell array, was string instead.

Error in safe_plot>safe_plot_internal (line 50)
r = cellfun(@(x)[' ' x],r,'UniformOutput',false);

Error in safe_plot (line 18)
safe_plot_internal(ipr.handle);

Error in run_cross_sectional>analyze_result (line 322)
safe_plot(@(id)plot_correlations(ds,id));

Error in run_cross_sectional>run_cross_sectional_internal (line 164)
analyze_result(ds);

Error in run_cross_sectional (line 51)
[result,stopped] = run_cross_sectional_internal(ds,sn,temp,out,k,d,car,sf,fr,analyze);

Error in run>@(temp,out,analyze)run_cross_sectional(ds,sn,temp,out,0.95,0.40,0.08,0.40,3,analyze)

Error in run (line 121)
[result,stopped] = run_function(temp,out,analyze);

In safe_plot.m, the relevant code around line 50 is as follows:

function safe_plot_internal(handle)

persistent ids;

name = func2str(handle);
name = regexprep(name,'^@\([^)]*\)','');
name = regexprep(name,'\([^)]*\)$','');

try
    id = [upper(name) '-' upper(char(java.util.UUID.randomUUID()))];
catch
    id = randi([0 10000000]);
    
    while (ismember(id,ids))
        id = randi([0 100000]);
    end
    
    ids = [ids; id];
    id = [upper(name) '-' sprintf('%08s',num2str(id))];
end

try
    handle(id);
catch e
    delete(findobj('Type','Figure','Tag',id));
    
    r = getReport(e,'Extended','Hyperlinks','off');
    r = split(r,newline());
    r = cellfun(@(x)['  ' x],r,'UniformOutput',false);  %%% line 50 is here
    r = strrep(strjoin(r,newline()),filesep(),[filesep() filesep()]);

    warning('MATLAB:SystemicRisk',['The following exception occurred in the plotting function ''' name ''':' newline() r]);
end

end

I'm new to MATLAB and it's hard for me to understand this code completely in a short time. Do you have any idea what causes this issue? Thanks a lot.
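The error message says that r was a string rather than a cell array when it reached cellfun. One possible workaround, which is not necessarily the fix shipped in the later release, is to split the report with regexp, since its 'split' option returns a cell array for a character vector input. The sketch below uses a made-up report text in place of the real getReport output.

```matlab
% Stand-in for the text returned by getReport in safe_plot_internal (made up).
r = sprintf('Error using foo\nSomething went wrong.');

% regexp with the 'split' option yields a cell array of character vectors,
% so the cellfun call below receives the input type it expects.
r = regexp(r,'\n','split');

% Indent every line of the report, as the original code does.
r = cellfun(@(x)['  ' x],r,'UniformOutput',false);

% Join the lines back into a single character vector.
r = strjoin(r,sprintf('\n'));

disp(r);
```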

TommasoBelluzzo commented on August 25, 2024

I opened a new issue for this problem, which was off topic with respect to the current issue.

Unfortunately, I don’t have MATLAB 2016 and I cannot install it for the moment. It might take some time for me to have the tools needed to debug this error. You might attempt to set a few breakpoints in that function to see what happens.

TommasoBelluzzo commented on August 25, 2024

Please try to reprocess your dataset with the new release.

DF-18 commented on August 25, 2024

I ran the cross-sectional part of the new release, and it works fine. Thanks for your efforts to fix the bugs.

To make my unbalanced panel data work, I replaced the blank cells with the latest available value in Excel, so no blank cells remain when the dataset is imported into MATLAB. After the calculation, the results for any firm and date that were blank in the original data must be set back to blank, because those values are missing: the firm had not been listed at that moment. In short, to let the code run without errors, I input fake data for the firms with missing values and then blank out the corresponding results after the calculation.

I'm not good at programming, so this approach feels clumsy, although it gets me what I want.
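The fill-then-mask approach described above can be sketched as follows. This is only an illustration under stated assumptions: the panel is a made-up numeric matrix (rows are dates, columns are firms, NaN marks a blank cell), the doubling step is a placeholder for whatever measure is actually computed, and the result is assumed to have the same size as the input.

```matlab
% Made-up panel: rows are dates, columns are firms, NaN marks a blank cell.
x = [NaN 10; NaN 11; 5 12; 6 NaN];

% Remember where the original data were missing.
mask = isnan(x);

% Fill the gaps column by column.
filled = x;
for j = 1:size(filled,2)
    col = filled(:,j);
    first = find(~isnan(col),1,'first');
    if isempty(first)
        continue; % the firm has no data at all, leave it untouched
    end
    col(1:first-1) = col(first); % repeat the first value back to the panel start
    for i = first+1:numel(col)
        if isnan(col(i))
            col(i) = col(i-1); % carry the latest value forward
        end
    end
    filled(:,j) = col;
end

% Placeholder computation standing in for the real measures (assumption).
result = 2 .* filled;

% Blank out the results that correspond to originally missing observations.
result(mask) = NaN;

disp(result);
```

Keep in mind that the filled values still enter any measure computed across firms, so masking only hides the results of the affected firm and date.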

TommasoBelluzzo commented on August 25, 2024

Unfortunately, it's not easy to deal with this kind of situation. If you don't want to remove those time series, your approach is a reasonable way to work around this limitation.
