
Comments (7)

mihaibudiu commented on September 1, 2024

The diagram at https://github.com/vmware/hillview/blob/master/docs/userManual.md#11-system-architecture shows the system architecture. Only workers read data. If you read data from files, the files should be on the same machines where the workers reside. On a single machine you can load files from that machine. On a cluster it is easiest to divide the data among the machines where the workers are, placing all files that should be analyzed together in the same directory. If you have a more concrete use case, we can discuss it specifically.

from hillview.

mihaibudiu commented on September 1, 2024

But ideally you should not need to move any of your data when using Hillview. If you already have the data stored in a distributed system, e.g. a set of logs on some machines, the ideal case is to deploy a Hillview worker on each machine which stores some of the data. Many data lakes look like this.


pradeepgaur commented on September 1, 2024

Is there a way to simply browse the data? I get the attached view, which requires double-clicking individual columns to load data. I think that a few days ago, on the hosted demo "flights csv" dataset, I was able to just browse the data.

[Attached screenshot]


mihaibudiu commented on September 1, 2024

I think that you have loaded the data correctly. The issue is that your table has many columns, so it starts in a "Schema" view instead of a "Table" view. The schema view shows all columns and lets you choose which ones to see in a table view. So you have selected 9 columns and displayed them as a table. If you want to see a table with all columns, select all of them (click on the first row, then shift-click on the last row) and use the menu "view selected columns." I could also add menu buttons such as "select all columns" or "view all columns as table" to make this easier.


mihaibudiu commented on September 1, 2024

The reason we show a schema view for wide tables is that they are too wide to fit nicely on the screen, so we give you the option to select only a subset of the columns.
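To make the behavior described here concrete, the following is a minimal sketch and not Hillview's actual code. The names `chooseInitialView` and `selectColumnRange` and the `maxTableColumns` threshold are hypothetical; the thread does not say how many columns count as "wide".

```typescript
// Hypothetical sketch: pick the first view for a newly loaded table.
type InitialView = "schema" | "table";

// maxTableColumns is an assumed, caller-supplied threshold.
function chooseInitialView(columnCount: number, maxTableColumns: number): InitialView {
    return columnCount > maxTableColumns ? "schema" : "table";
}

// Mimics click on one row plus shift-click on another in the schema view:
// returns every column between the two positions, inclusive.
function selectColumnRange(allColumns: string[], first: number, last: number): string[] {
    const low = Math.min(first, last);
    const high = Math.max(first, last);
    return allColumns.slice(low, high + 1);
}
```

Selecting from the first to the last row with `selectColumnRange` corresponds to the "select all, then view selected columns" workflow.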


pradeepgaur commented on September 1, 2024

Thanks for your prompt responses. I wanted to know the following.

  1. "View Selected Columns" gives me some kind of aggregated view, so each row can be an aggregation of multiple rows. I just want to take a look at the raw data without aggregation.
  2. "Demo Dataset" main menu item: can I add my own menu item to load a frequently used dataset? How?

I really appreciate your responses, and I see good potential for Hillview in my project.


mihaibudiu commented on September 1, 2024

Hillview will always aggregate the displayed data in some form, because most data does not fit on the screen.
To see data in all columns you can select all columns and then click "view/show". But even then the view you will see will be aggregated. There is no easy way to see the rows of the original file in the order they are in the file. This is because we assume the data is split between multiple files and there is no clear ordering of the files. We could add an option to also have a column which is the line number, and then you could sort on that column.
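The line-number idea can be sketched roughly as follows. This is an illustration only: Hillview does not provide this feature according to the comment above, and `SourceRow`, `tagRows`, and `inSourceOrder` are hypothetical names. The comma split is deliberately naive and ignores quoted CSV fields.

```typescript
// Hypothetical row that remembers where it came from.
interface SourceRow {
    file: string;      // name of the file the row was read from
    line: number;      // 1-based line number within that file
    values: string[];  // cell values (naive comma split, no quoting support)
}

// Attach the file name and line number to every line of one file.
function tagRows(file: string, lines: string[]): SourceRow[] {
    return lines.map((text, index) => ({
        file,
        line: index + 1,
        values: text.split(","),
    }));
}

// Sort by file name, then by line number. The order between files is an
// arbitrary choice here, since the files themselves have no clear ordering.
function inSourceOrder(rows: SourceRow[]): SourceRow[] {
    return [...rows].sort((a, b) => {
        if (a.file !== b.file) {
            return a.file < b.file ? -1 : 1;
        }
        return a.line - b.line;
    });
}
```

Within a single file this recovers the original row order; across files it only gives a stable, repeatable order.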

For 2, the only solution right now is to edit the code; this is in the file loadView.ts.

But this is a good idea: to give you the possibility of creating a JSON file with a set of files to load. I will file a separate issue for that. If you look at the code in loadView.ts, it looks like this:

testitems.push(
    {
        text: "Flights (15 columns, CSV)",
        action: () => {
            const files: FileSetDescription = {
                fileNamePattern: "data/ontime/????_*.csv*",
                schemaFile: "short.schema",
                schema: null,
                headerRow: true,
                name: "Flights (15 columns)",
                fileKind: "csv",
            };
            this.init.loadFiles(files, this.page);
        },
        help: "The US flights dataset.",
    },
...

So the file description is in fact just a JSON object, which we could read from a file. But I will need to document the schema of the JSON.
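One possible shape for that idea is sketched below, under several assumptions. The `FileSetDescription` fields are inferred only from the snippet in this comment, and the real type may differ. The JSON layout (an array of entries with `text`, `help`, and `files`), the `DatasetEntry` and `MenuItem` types, and the defaults for missing fields are all hypothetical, since the schema has not been documented yet.

```typescript
// Fields inferred from the loadView.ts snippet; the real type may have more.
interface FileSetDescription {
    fileNamePattern: string;
    schemaFile: string | null;
    schema: unknown | null;
    headerRow: boolean;
    name: string;
    fileKind: string;
}

// Hypothetical entry of a user-supplied JSON file.
interface DatasetEntry {
    text: string;
    help: string;
    files: FileSetDescription;
}

interface MenuItem {
    text: string;
    action: () => void;
    help: string;
}

// Parse and minimally validate the JSON text. Defaults are assumptions.
function parseDatasetEntries(jsonText: string): DatasetEntry[] {
    const parsed: unknown = JSON.parse(jsonText);
    if (!Array.isArray(parsed)) {
        throw new Error("Expected a JSON array of dataset entries");
    }
    return parsed.map((raw: unknown, index: number) => {
        if (typeof raw !== "object" || raw === null) {
            throw new Error(`Entry ${index} is not an object`);
        }
        const entry = raw as Partial<DatasetEntry>;
        const files = entry.files as Partial<FileSetDescription> | undefined;
        if (typeof entry.text !== "string" || files == null ||
            typeof files.fileNamePattern !== "string" ||
            typeof files.fileKind !== "string") {
            throw new Error(`Entry ${index} is missing required fields`);
        }
        return {
            text: entry.text,
            help: entry.help ?? "",
            files: {
                fileNamePattern: files.fileNamePattern,
                schemaFile: files.schemaFile ?? null,
                schema: files.schema ?? null,
                headerRow: files.headerRow ?? true, // assumed default
                name: files.name ?? entry.text,     // assumed default
                fileKind: files.fileKind,
            },
        };
    });
}

// Turn entries into menu items; `load` stands in for this.init.loadFiles.
function toMenuItems(entries: DatasetEntry[],
                     load: (files: FileSetDescription) => void): MenuItem[] {
    return entries.map((entry) => ({
        text: entry.text,
        action: () => load(entry.files),
        help: entry.help,
    }));
}
```

In loadView.ts the `load` callback would presumably wrap `this.init.loadFiles(files, this.page)`, and the resulting items could be pushed onto `testitems` in place of hard-coded entries.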
