mediumroast / mediumroast_js
Mediumroast for GitHub CLI and API/SDK
Home Page: https://www.mediumroast.io/product.html
License: Apache License 2.0
For operations that require modifications of objects like Create, Delete and Update early warning needs to be provided to the user so they know the container is locked and being used. Ideally this information should include the GitHub login name and function that has the container in question locked.
env.js
When creating non-public companies the initial setting of the name does not take and instead reverts to the default of Unknown. This results in secondary impacts; notably, the helper URLs for Google Maps, Google Patents and Google News are also set to Unknown.
Perform extra steps to make key parts of the API doc work better.
When using the interaction add_wizard there is no link created for a company, but the interaction is linked to the company.
Since it is possible to ingest 1:N files in a directory the revised approach is to remove the feature of specifying a single file and instead only allow for a directory to be specified.
Calling the logo_server should enable the storage of an icon from a company's website. The results aren't coming back correctly. Confirmation should be made that the data is returning correctly and if so then we need to understand why the data isn't coming back in the right format.
In the setup CLI linking to the default study should be eliminated. We will pursue more complete study implementation in the Alpha 3 project.
? Your company's name is? ProductPlan
? What type of company is this? Private
Set the company's type to [Private]
? What role should we assign to this company? Competitor
Set the company's role to [Competitor]
? What region is this company in? North, Meso and South America (AMER)
Set the company's region to [AMER]
Attempting to automatically discover company firmographics.
? No company matching [ProductPlan] found, try again? Yes
? Your company's name is? ProductPlan
? No company matching [ProductPlan] found, try again? No
No matching company found, starting manual company definition...
? What's the description? ProductPlan is a platform for product managers to align their teams, collaborate on their roadmaps, and measure the
impact of their work.
? What's the website? https://www.productplan.com/
? What's the street address? 836 Anacapa Street Suite 944
? What's the city? Santa Barbara
? What's the state or province? CA
? What's the country? USA
? What's the zip or postal code? 93101
? What's the phone number? 805-618-2975
? What's the wikipedia url? Unknown
? What's your industry search string? computer
? Please choose the most appropriate industry Computers and Computer Peripheral Equipment and Software
file:///usr/local/lib/node_modules/mediumroast_js/src/cli/companyWizard.js:576
const [status, msg, [lat, long]] = await this.getLatLong(fullAddress)
^
TypeError: object null is not iterable (cannot read property Symbol(Symbol.iterator))
at AddCompany.doAutomatic (file:///usr/local/lib/node_modules/mediumroast_js/src/cli/companyWizard.js:576:33)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async AddCompany.wizard (file:///usr/local/lib/node_modules/mediumroast_js/src/cli/companyWizard.js:755:21)
at async file:///usr/local/lib/node_modules/mediumroast_js/cli/mrcli-company.js:230:19
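The trace suggests that the third element returned by getLatLong was null, so the nested destructuring of [lat, long] threw. A minimal sketch of a more defensive pattern follows. The getLatLong stub and the safeLatLong wrapper are hypothetical, and the return shape is only inferred from the destructuring shown in the trace, not taken from the actual module.

```javascript
// Hypothetical stand-in for the wizard's getLatLong; the return shape
// [status, msg, coords] is inferred from the stack trace above.
async function getLatLong(address) {
  // Simulate a failed geocoding lookup: the coordinates slot is null.
  return [false, `No coordinates found for [${address}]`, null]
}

// Destructure in two steps so a null coordinate pair does not throw.
async function safeLatLong(address) {
  const [status, msg, coords] = await getLatLong(address)
  const [lat, long] = Array.isArray(coords) ? coords : ['Unknown', 'Unknown']
  return { status, msg, lat, long }
}

safeLatLong('836 Anacapa Street Suite 944, Santa Barbara, CA').then((result) => {
  console.log(result)
})
```

Falling back to Unknown mirrors the default used elsewhere in these issues; whether the wizard should instead prompt the user for coordinates is a separate design decision.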
The default naming of the application should follow the pattern <company_name>_discovery to signify that this is a repository for discovery artifacts.
This was encountered when trying to create Airtable, which has a Wikipedia page but is not a public company. This means the code path should be evaluated to see what is going on.
{
"code": 200,
"message": "Only Wikipedia data has been detected for the company [Formagrid, Inc.].",
"module": "Query-> merge_data",
"data": {
"description": "Airtable is a cloud collaboration service headquartered in San Francisco. It was founded in 2012 by Howie Liu, Andrew Ofstad, and Emmett Nicholas. Airtable is a spreadsheet-database hybrid, with the features of a database but applied to a spreadsheet. The fields in an Airtable table are similar to cells in a spreadsheet, but have types such as 'checkbox', 'phone number', and 'drop-down list', and can reference file attachments like images. Users can create a database, set up column types, add records, link tables to one another, collaborate, sort records and publish views to external websites.",
"wikipediaURL": "https://en.wikipedia.org/wiki/Airtable",
"type": "Private Company",
"industry": [
"Unknown"
],
"name": "Formagrid, Inc.",
"country": "Unknown",
"city": "San Francisco, California, US",
"website": [
"https://airtable.com/"
],
"isin": "Unknown",
"cik": "Unknown",
"exchanges": [
"Unknown"
],
"longitude": -122.41966,
"latitude": 37.77712,
"address": "Unknown",
"googleMaps": "https://www.google.com/maps/place/San%20Francisco%2C%20California%2C%20US%20Unknown",
"googleNews": "https://news.google.com/search?q=Formagrid%2C%20Inc.",
"googlePatents": "https://patents.google.com/?assignee=Formagrid%2C%20Inc."
},
"dependencies": {
"modules": {
"edgar": "https://github.com/miha42-github/company_dns",
"wikipedia": "https://github.com/miha42-github/company_dns",
"wptools": "https://pypi.org/project/wptools/",
"geopy": "https://pypi.org/project/geopy/"
},
"data": {
"sicData": "https://github.com/miha42-github/sic4-list",
"oshaSICQuery": "https://www.osha.gov/data/sic-search",
"wikiData": "https://www.wikidata.org/wiki/Wikidata:Data_access"
}
}
}
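One detail visible in the response above is that the googleMaps link ends with a literal Unknown, because the address field is unset. The following is a rough sketch of building the link only from known parts. The buildGoogleMapsUrl helper is hypothetical and is not part of company_dns or mediumroast_js.

```javascript
// Hypothetical helper: build the Google Maps link from address parts,
// skipping any part that is missing or still set to 'Unknown'.
function buildGoogleMapsUrl(parts) {
  const known = parts.filter((part) => part && part !== 'Unknown')
  if (known.length === 0) return 'Unknown'
  return 'https://www.google.com/maps/place/' + encodeURIComponent(known.join(' '))
}

// Values taken from the response above: the city is known, the address is not.
const mapsUrl = buildGoogleMapsUrl(['San Francisco, California, US', 'Unknown'])
console.log(mapsUrl)
```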
The following behaviors have been observed on macOS and Linux (Ubuntu 22). Checking on Windows has yet to be performed.
Unable to run mrcli setup:
The error message states: "Cannot find package 'json2csv'". Even after installing json2csv, the error persists.
Discovered on Ubuntu 22.04.3 LTS.
john@medici:~$ npm --version
10.2.4
john@medici:~$ node -v
v21.4.0
john@medici:~$ mrcli setup
node:internal/modules/esm/resolve:853
throw new ERR_MODULE_NOT_FOUND(packageName, fileURLToPath(base), null);
^
Error [ERR_MODULE_NOT_FOUND]: Cannot find package 'json2csv' imported from /usr/local/lib/node_modules/mediumroast_js/src/cli/output.js
at packageResolve (node:internal/modules/esm/resolve:853:9)
at moduleResolve (node:internal/modules/esm/resolve:910:20)
at defaultResolve (node:internal/modules/esm/resolve:1130:11)
at ModuleLoader.defaultResolve (node:internal/modules/esm/loader:396:12)
at ModuleLoader.resolve (node:internal/modules/esm/loader:365:25)
at ModuleLoader.getModuleJob (node:internal/modules/esm/loader:240:38)
at ModuleWrap. (node:internal/modules/esm/module_job:85:39)
at link (node:internal/modules/esm/module_job:84:36) {
code: 'ERR_MODULE_NOT_FOUND'
}
Node.js v21.4.0
To ensure that we aren't unintentionally uploading duplicates, a hash should be created and used to check whether an interaction with the same file_hash already exists. There are multiple reasons for this, and the most important is that there is no need to create duplicates. Users should strive to reuse existing content and be prevented during ingest from adding duplicated interactions.
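A minimal sketch of the duplicate check, assuming the hash is SHA-256 and that existing interactions are available as an array of objects carrying a file_hash property. The function names sha256Hex and findDuplicate are hypothetical; only createHash from Node's built-in crypto module is a real API.

```javascript
// Compute a SHA-256 hex digest of a string or Buffer.
// A dynamic import keeps the sketch usable from both ESM and CommonJS.
async function sha256Hex(content) {
  const { createHash } = await import('node:crypto')
  return createHash('sha256').update(content).digest('hex')
}

// Hypothetical duplicate check: `interactions` is assumed to be an array
// of objects that each carry a file_hash property.
async function findDuplicate(content, interactions) {
  const fileHash = await sha256Hex(content)
  const match = interactions.find((item) => item.file_hash === fileHash)
  return { fileHash, duplicate: match || null }
}
```

During ingest the caller could skip the upload, or warn the user, whenever duplicate is not null.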
file_hash was created to store a SHA-256 hash
findByHash that is specifically for Interactions
The @category and @subcategory options in source might be helpful to better organize the documentation.
As we only want to have unique companies a check is required to verify that users do not enter the same name of the company again thus creating a duplicate.
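A rough sketch of such a check, assuming the existing companies are available as an array of objects with a name property. The companyExists helper is hypothetical, and comparing names case-insensitively after trimming is an assumption rather than something the issue specifies.

```javascript
// Hypothetical uniqueness check run before a company is created.
// Names are compared after trimming and lower-casing (an assumption).
function companyExists(name, companies) {
  const wanted = name.trim().toLowerCase()
  return companies.some((company) => company.name.trim().toLowerCase() === wanted)
}

const existingCompanies = [{ name: 'ProductPlan' }, { name: 'Formagrid, Inc.' }]
console.log(companyExists(' productplan ', existingCompanies))
console.log(companyExists('Cerebras', existingCompanies))
```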
Attributes for each object must be able to be updated by the user via the CLI.
While this basic implementation already exists in github.js, it is not implemented in gitHubServer.js, nor reflected or implemented in the CLI. Implementation of this feature must be done for the implementation of the interactions wizard, as linking to companies mandates company object updates.
Both general and specific implementation notes are provided to consider how to carefully and correctly implement this feature. As the implementation proceeds additional notes for each object type will be added.
In the package JSON add links for GitHub, Homepage and possibly API doc, so that users are able to reach these locations.
While improvements have been made to the company reporting some of the linking and usability needs to be updated. Additionally, there are some generic improvements that should be considered for the CLI to enable a better overall experience.
Add the following a per company compared to:
Bridging from the user functions in GitHub.js implement mirroring CLI operations starting with reads/gets. This information is needed for interactions creation, and in the future will be required for the purposes of synchronizing permissions, user info and access controls to the backend.
At this time the following steps are recommended for operationalizing the user CLI
For the initial release the CLI for studies should be disabled.
An improvement to the current CLI wizard for interactions needs to be implemented. Essentially the following steps should be pursued:
Steps for the automated code path include:
Many of the logic primitives needed to do this are already available.
npm i -g mediumroast_js
mrcli setup and follow all required steps
Perform the test with a public company like IBM or Hitachi
Perform the test with a non-public company having a Wikipedia page like Cerebras
Perform the test with a non-public company having no Wikipedia page like Mediumroast, Inc.
The company Id used for the comparison property appears not to match the actual company Id. This needs to be remedied for the purposes of client correctness. Otherwise components and clients using this data will present incorrect data to the user.
Given the following company Ids in the system:
When these ids are found in the comparison section for uMETHOD, the key for the object is incorrect and doesn't match the system. There could be many reasons for this, but possibly this is due to the system not having persistent identifiers upon an object restore. It means that when a backup and a restore are done the object ids can change, and it is therefore unsafe to rely on the keys in the comparison property. A better approach would be to look up the company by name and then obtain the id. Since the name is the only consistent property across backups and restores this will be the preferred implementation and clients should focus on this behavior.
{
'2': {
name: 'Savonix',
similarity: 0.7193503975868225,
role: 'Competitor',
most_similar: { name: 'Science - Savonix', score: 0.8715741634368896 },
least_similar: {
name: 'Bayer Selects Savonix Digital Cognitive Assessment Platform to Validate the Effects of Multivitamin Supplement Berocca in Malaysia | Business Wire',
score: 0.7110357284545898
}
},
'3': {
name: 'PrecivityAD',
similarity: 0.758750319480896,
role: 'Competitor',
most_similar: {
name: 'C₂N Data Release for New Blood Test Combining p-tau217 Ratio with Amyloid beta 42:40 — PrecivityAD™',
score: 0.8071120381355286
},
least_similar: {
name: 'ApoE_Genotyping_Physician_FAQ',
score: 0.5041195154190063
}
},
'4': {
name: 'Neurotrack Technologies, Inc.',
similarity: 0.6818479299545288,
role: 'Competitor',
most_similar: {
name: 'Life Insurance Industry Invests In Cognitive Health To Tackle The Future Of Aging',
score: 0.8111998438835144
},
least_similar: {
name: 'Lets Talk Neurotransmitters - Neurotrack',
score: 0.5499856472015381
}
}
}
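The name-based lookup recommended above can be sketched as follows. The company list and its ids are invented for illustration, the comparison object is a trimmed copy of the one shown, and resolveComparisonIds is a hypothetical helper rather than an existing API function.

```javascript
// Hypothetical company list after a restore; these ids are invented.
const companies = [
  { id: '11', name: 'Savonix' },
  { id: '12', name: 'PrecivityAD' },
  { id: '13', name: 'Neurotrack Technologies, Inc.' }
]

// Trimmed copy of the comparison property shown above, keyed by stale ids.
const comparison = {
  '2': { name: 'Savonix', similarity: 0.7193503975868225 },
  '3': { name: 'PrecivityAD', similarity: 0.758750319480896 },
  '4': { name: 'Neurotrack Technologies, Inc.', similarity: 0.6818479299545288 }
}

// Ignore the comparison keys and resolve each entry to a current id by name.
// Entries whose name is not found get an id of null.
function resolveComparisonIds(comparison, companies) {
  const idByName = new Map(companies.map((company) => [company.name, company.id]))
  return Object.values(comparison).map((entry) => ({
    name: entry.name,
    id: idByName.get(entry.name) ?? null,
    similarity: entry.similarity
  }))
}

console.log(resolveComparisonIds(comparison, companies))
```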
In the CLI first lookup the company by name and then the interaction by name as well. This way there is no ambiguity.
The present implementation of chart.js produces images from static data. Therefore the aim will be to connect chart creation to real data sources served from the backend. In some cases the backend may need to be updated to include new data sources for a complete implementation.
There are two major steps needed to completely make charts live in the dashboard, both are explained below.
chart.js to be a module rather than a standalone app
echart_server and run on the same server as minio
dashboard.js to call chart.js with the appropriate data sources as they are available
echart_server
Ultimately, the model for quality that is surfaced for the radar chart is simpler than what was thought up here. A more generic model and a mapping of types of content to the generic model was created. This could suggest that a drill-down report into what's included in the general model is warranted. More thought is required. The items below have been updated to accommodate the more generic approach and a parking lot matter has been added to the alpha project.
Refactor wizard into smaller more maintainable modules
Currently the mr_setup utility creates the right config file, but cli.js won't read that setting. This needs to be fixed.
With the emphasis on moving the basic backend to GitHub interactions must be updated to account for the change. Updates will cover interaction object creation and upload, CLI cosmetic and presentation changes, module documentation changes, and so on. For this phase only creation and unfiltered object gets will be pursued.
Necessary steps to make the CLI operational.
Implement what it will take to create one or more interactions.
Things that are going to just improve the CLI associated modules, etc.
The installation instructions should report to the user the URL of the GitHub application such that they are able to have it installed in their organization.
mrcli company --output=csv
If ~/Documents is missing, mrcli will not generate a warning or error yet will return 0 as if nothing is wrong.
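A minimal sketch of a pre-flight check that would surface this condition instead of silently returning 0. The checkOutputDir function is hypothetical, and the wording of the message is an assumption; existsSync is the real call from Node's fs module.

```javascript
// Hypothetical pre-flight check for the directory that output is written to.
// Resolves to a process exit code: 0 when the directory exists, 1 otherwise.
// A dynamic import keeps the sketch usable from both ESM and CommonJS.
async function checkOutputDir(dir) {
  const { existsSync } = await import('node:fs')
  if (!existsSync(dir)) {
    console.error(`ERROR: output directory [${dir}] does not exist, nothing was written.`)
    return 1
  }
  return 0
}

// Possible use in a CLI entry point:
//   process.exitCode = await checkOutputDir(outputDir)
```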
Create a restore function for the purposes of capturing objects and enabling the restoration process.
Break things up into correct and smaller modules to enable better maintainability.
With new data available for the comparison checking in caffeine some reporting is needed to enable the user to see the results.
Add the following a per company compared to:
I believe that there is a potential bug in the node.js filesystem module. This file is needed because otherwise the final of the two images that need to be inserted into the docx file won't load. If we create a scratch file then it will. Essentially something is off with the last file created in a series of files, but the second to last file appears ok.
Obviously more testing is needed before we approach the node team with something half baked. Until then here are some observations:
In at least the CLI report, there are references to study, company and interaction objects. To ensure that there's coupling between the front end and at least the CLI generated report, active links to these objects should be created. Additionally, there are links in the document to various locations; these need to be verified and tested.
Implementation assumes that the web_ui has links for these objects available and they are operable. Additionally, as the report logic is moved out of the CLI only and into the web_ui, the report should carry the same features there too.
There are links to reports and to/from various sections. This needs to be verified and tested.
To help with consistent readability create common output formatting for the CLI related to items and sub items. The image below provides an example for formatting.
Notable ideas:
While the code operates it is not as clean and maintainable as it should be. There are functions that are in the wrong module, implementation of return codes and methodologies is variable, and so on. The aim of this issue is to capture various improvements and cleanup the code such that someone else can more easily consume it.
In some cases functions return a structure of the form [<boolean_success>, {status_code: <code>, status_msg: <msg>}, <result_object>], while in other cases the approach is inconsistent. The return structures should be made consistent as much as possible. The general rule is that anything returned to a caller from outside of the module should receive the structure listed above.
github.js should have basic functions like readObjects, writeObjects, and so on. This would leave more complex functions like catch, create and release for githubServer.js. Further, in some cases the implementation of these more complex functions is in the wizard instead of one of these two modules. A good example of this is interactionWizard.js, which implements a bespoke create function for an interaction because an interaction consists of both a JSON object and a file system object. This should be revisited to see whether everything could be moved into githubServer.js. Perhaps the best examples of this are the delete and update functions, which are well structured but wrongly placed.
In github.js the updateObject function should be simplified and follow the deleteObject approach. In github.js the createObject function should be simplified to follow deleteObject. In both cases the implementation should move to githubServer.js.
Add functionality to mr_interaction to ingest contents from a directory. This assumes that all contents are associated with a specific company. Additionally, an option for creating a single interaction from a single file in the file system will be supported.
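The consistent return structure called for in the cleanup notes, [<boolean_success>, {status_code: <code>, status_msg: <msg>}, <result_object>], could be produced by two small helpers. This is only a sketch: the ok and fail helpers, the status codes and the in-memory store are all hypothetical, and the readObjects shown here is an illustration rather than the function in github.js.

```javascript
// Hypothetical helpers that always produce the
// [success, {status_code, status_msg}, result] triple.
function ok(result, msg = 'OK', code = 200) {
  return [true, { status_code: code, status_msg: msg }, result]
}

function fail(msg, code = 500, result = null) {
  return [false, { status_code: code, status_msg: msg }, result]
}

// Illustrative caller-facing function backed by an in-memory store.
function readObjects(store, objType) {
  if (!(objType in store)) return fail(`No objects of type [${objType}] found`, 404)
  return ok(store[objType])
}

const [success, status, result] = readObjects({ Companies: [{ name: 'Savonix' }] }, 'Companies')
console.log(success, status, result)
```

Routing every externally visible return through the same two helpers keeps the shape identical across modules, so callers can always destructure the triple the same way.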
mr_setup
The following tests must be performed to declare this issue completed and associated branch closed.
mrcli c --add_wizard
file:///usr/local/lib/node_modules/mediumroast_js/src/cli/companyWizard.js:46
this.env.DEFAULT.company_dns ? this.companyDNS = this.env.DEFAULT.company_dns : this.companyDNS = companyDNSUrl
^
TypeError: Cannot read properties of undefined (reading 'company_dns')
at new AddCompany (file:///usr/local/lib/node_modules/mediumroast_js/src/cli/companyWizard.js:46:26)
at file:///usr/local/lib/node_modules/mediumroast_js/cli/mrcli-company.js:232:23
Node.js v21.4.0
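The trace indicates that this.env.DEFAULT was undefined when the constructor ran. A minimal sketch of a guard using optional chaining follows. The resolveCompanyDns function is hypothetical, and the fallback URL is a placeholder, since the real value of companyDNSUrl is not shown in the issue.

```javascript
// Placeholder default; the real companyDNSUrl value is not shown in the issue.
const companyDNSUrl = 'https://company-dns.example.com'

// Optional chaining yields undefined instead of throwing when env or
// env.DEFAULT is missing, so the fallback is used in that case.
function resolveCompanyDns(env, fallback = companyDNSUrl) {
  return env?.DEFAULT?.company_dns || fallback
}

console.log(resolveCompanyDns(undefined))
console.log(resolveCompanyDns({ DEFAULT: { company_dns: 'http://localhost:8000' } }))
```

This avoids the crash, but it does not explain why the DEFAULT section was missing from the environment in the first place, which may still need to be addressed in setup.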
The backup process works for mr_backup, but not the restore
Add developer documentation and create the appropriate GitHub pages configuration for the developer doc. This will be done with JSDocs. The prototype has already been developed, but need to add:
The documentation for the CLI is incomplete and should be finished.
The application on GitHub has been updated with a new description and appropriate URLs. It is not yet listed on the market place. That will be a separate step.
The output of the authorization artifacts is challenging to read and handle. Therefore improvements need to be made to ensure users are able to more easily pick the URL and the code out from the command line.
If your platform supports this, opening your browser. Otherwise, navigate to the URL below in your browser.
Authorization URL: https://github.com/login/device
Type or paste the code below into your browser and follow the prompts to authorize this client.
Device authorization code: CE44-9B64
This is both hard to parse and read. Ideally the improvements should make it clear with coloring what needs to be copied and used from the CLI to achieve the intended outcome of authorization.
Improve the output logic to put the key items into a table and make the message a little easier to understand. The new output format is provided below in a screenshot.
Add the GitHub user details to the object definition like GitHub login, etc.
For completeness object deletion needs to be built in. Given the state of Alpha_2 deletion will focus on Interaction and Company objects only. Deletion of Study objects is out of scope for Alpha_2 and deletion of User objects is completely out of scope since that is the domain of the GitHub admin.
The following steps are captured based upon the implementation notes to help drive towards conclusion.
This will cover the simple use cases for Interactions and Companies plus prepare the way for Studies in the future.
--delete switch should take an object name
An implementation specific to Companies in gitHubServer.js.
mrcli-company.js we need to convert it to a similar structure as mrcli-user.js to add specific command line switches. Specifically a new option called --allow_orphans which will enable simple deletion will be created.
--allow_orphans command line switch.
--delete switch should take an object name
The following are needed to completely remove an interaction.
removeBlob function to completely remove the interaction contents.
github.js while a specific version for Companies will be implemented in gitHubServer.js.
Implement find_by_x and find_by_name for the general case to ensure the user can filter outputs with specific focus on what they are looking to see. Note that find_by_name is just a special implementation in the API for find_by_x where the attribute is name and the value is the name of the object in question.
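The relationship between the two finders can be sketched as below. The camelCase names findByX and findByName, and the assumption that objects are held in a plain array, are illustrative only and do not reflect the actual API signatures.

```javascript
// General case: keep the objects whose attribute strictly equals the value.
function findByX(objects, attribute, value) {
  return objects.filter((obj) => obj[attribute] === value)
}

// Special case: find_by_name is find_by_x with the attribute fixed to 'name'.
function findByName(objects, name) {
  return findByX(objects, 'name', name)
}

const sampleObjects = [
  { name: 'Savonix', role: 'Competitor' },
  { name: 'uMETHOD', role: 'Owner' }
]
console.log(findByX(sampleObjects, 'role', 'Competitor'))
console.log(findByName(sampleObjects, 'uMETHOD'))
```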
This is now ready for testing.
Presently the table in the main company's section doesn't match the similarity chart. This should be aligned.