Comments (38)
A modification of the project language part.
from cti-website-frontend.
Little progress on Database description document: https://drive.google.com/file/d/1nFra7PsgAiN56HO6IwJgYqUH1SaXIUr5/view?usp=sharing
from cti-website-frontend.
Hi @emecas I am using Lucidchart to create the ER diagram, here is the link: https://lucid.app/lucidchart/a8931e87-430a-49cb-b122-174448651333/view?page=0_0#?folder_id=home&browser=icon
If you need to edit the document, click "request edit access" on the top right corner~
from cti-website-frontend.
This is the current database architecture and may be modified in the future.
from cti-website-frontend.
@jsachsman and @ExperimentsInHonesty need a site architecture review with @ddkbky, @ashleyerffmeyer, @Abhishek-AC, @nrrao, and @codewilling to review updating how this database will be used and connected to our user flow. Possibly will try to get this in Sunday 9/6.
from cti-website-frontend.
Name of Project
URL of Repository
Existing Topic Tags
New Topic Tags
Process type: [manual, automated, none]
from cti-website-frontend.
project link should be changed to project URL and add link
We need to add to the database model diagram which fields can have multiple values or just one value.
from cti-website-frontend.
Some changes:
- Add a ‘user’ entity table.(Is it proper to set the ‘password’ to 16 characters?)
- Add ‘creation_time’ and ‘creation_user_id’ column for ‘project_tag’ table to record the tags’ creation history. When each project creates a new tag, a new record will be created, no matter whether the tag is the same as the other project's tag.
- Modify ‘project_link’ to ‘project_url’ in ‘project’ table. Add a column named ‘repository_url’.
- Add ‘process_type’ to ‘project’ table. [manual, automated, none] (Can we change this column to a Boolean type? Automated: True/ False/ Null)
from cti-website-frontend.
Some SQL statements can help us lookup tags, for example,
New topic tags:
SELECT DSTINCT tag_id
FROM project_tag
WHERE creation_time > '2020-08-01';
Trending topics:
SELECT COUNT(p.tag_id), t.name
FROM project_tag AS p
JOIN tag AS t
ON p.tag_id = t.tag_id
WHERE p.creation_time > '2020-08-01' AND t.tag_type = 'topic'
ORDER BY COUNT(tag_id) DESC;
from cti-website-frontend.
@ddkbky please change contributor throughout to organization
please remove the following columns in the user table: password, email
in user table please change organization_id to organization_ids and make it an array of integers rather than a single integer
thanks!
from cti-website-frontend.
@ddkbky also please add a new table to represent search results on tags named search_information
from cti-website-frontend.
@jsachsman If you need to map many users to many organizations, wouldn't it be better to create a mapping table rather than dealing with an array of project ids? If this data model is going to be managed by Django, the mapping table will be created automatically if you use a ManyToMany relation between users and organizations.
from cti-website-frontend.
- Remove “password”, “email” in “user” table, and add a column named “head_portrait_url”.
- Add a table “user_organization” to connnect the many to many relationship between users and organizations.
- I was busy this week and haven’t been able to add a table to store the search information, sorry.
from cti-website-frontend.
- Add a table named "search_information"
- Remove "head_potrait_url" column in "user" table
from cti-website-frontend.
- No changes to the database schema.
- Have some ideas to the “trending topics” display bar.
- Sometimes the contents of search may not be structured. For example, people input "Pollution in Europe". It includes two keywords "pollution" and "europe". So, if we look up the most keywords by an SQL statement, we may ignore several different keywords that appear in a phrase or short sentence.
- In that case, we can use Python text mining tools to analyze the search contents. It can count the number of times keywords appear in an entire document, and has the function of analyzing part of speech of words.
- I am preparing a database description document to explain the meaning of the various tables and columns.
from cti-website-frontend.
Database description document: https://drive.google.com/file/d/1nFra7PsgAiN56HO6IwJgYqUH1SaXIUr5/view?usp=sharing
The basic framework has been established,and filled in some contents.
from cti-website-frontend.
@ddkbky can we add a column to the organization table for github_id ? Would help filling out GH info for orgs and retrieving it. Thanks!
from cti-website-frontend.
Have added "github_id" to the "organiztion" table.
from cti-website-frontend.
Hi @ddkbky , what are you using to generate the ER diagram?
from cti-website-frontend.
Fix a bug: Delete "language_id" in "project" table.
from cti-website-frontend.
Database description document is almost done: https://drive.google.com/file/d/1nFra7PsgAiN56HO6IwJgYqUH1SaXIUr5/view?usp=sharing
from cti-website-frontend.
@shinjonathan @emecas will you have time to review the database schema and description doc before this Thursday?
We would like to take a tour through it and discuss how these tables are going to be used at that time.
from cti-website-frontend.
Crossing Schema #62 with Requirements gathering for DB across site #118
ID | Screen | Table(s) | Notes |
---|---|---|---|
1 | Landing Page | ? | Q#1 what is the table and column to storage the submitted email address ? |
2 | Home Page MVP | No DB required for MVP | |
3 | Home Page POST-MVP | search_information | OK - it seems fine to save the words used by people on the DB |
4 | About Page | No DB required | |
5 | Radical Collaboration / FAQ Page #1 | ? | is a table the destination for new questions (FAQ) from people ? |
6 | Radical Collaboration / FAQ Page #2 | ? | what is the table recording the number of clicks on each question |
7 | Directory Page | TBD | we need answer to these questions ? - what users search for, - what they click on/select, - what 8 8 |
9 | Search Page | TBD | Main Features: - when somebody type a search we need to record it and count the words used to use that in the trending topics calculation/presentation - then if they click refine result we are going to track it as well, like: - how they search - what features of the refining they use more |
10 | Donate Page | No DB required | |
11 | Social Media Page | No DB required | |
12 | TAG Generator Wizard Page | project | |
13 | TAG Generator Wizard Page | organization | |
14 | TAG Generator Wizard Page | language | |
15 | TAG Generator Wizard Page | project_language | |
16 | TAG Generator Wizard Page | project_tag | |
17 | TAG Generator Wizard Page | programing_language | |
18 | TAG Generator Wizard Page | project_pl |
@ExperimentsInHonesty
NOTE: why does 9 (Search Page) seem like overlapping 3 (Home Page POST-MVP) ?
from cti-website-frontend.
Some comments/questions on the data model:
- What is tag_type? How would a user enter this value?
- In tag table, name could be "value"?
- In organization table, instead of github_link a repository_url?
- In project table, add "status", website, slack_channel?
from cti-website-frontend.
@ddkbky could you and @shinjonathan review @giosce 's comments above for implementation?
from cti-website-frontend.
I see user_id in search_information table. Is that any user that search for projects or organizations (does it mean this person needs to be logged in)?
We can key off off the GitHub handle or an email address.
If we have user_id I suggest to use it in all the tables where a user enters data (organizations, tags, projects and relationship tables). Agreed. Yes. We'll need to check if we can find repository owners with the GitHub api.
Perhaps locations-projects and locations-organizations can be many-to-many (a project can be used in multiple locations). And locations can be worth to be a table with city, state, country.
Locations is difficult to source - we might be able to determine the project if tagged appropriately. (We can create hunches depending on the language. Unless we tag this, i'd recommend remove this. @chalimar let me know where we might want to include this.
I'd have also a user-project relationship (many to many). let's figure out how these users are keyed first. The screens point to an organization/project relationship at the moment (the user doesn't matter to from a CTI perspective)
In tag table, name could be "value"
name, value or tag seems fine to me.
In organization table, instead of github_link a repository_url?
we should add this attribute to the organization table
I'd have also a user-project relationship (many to many).
Storing the org to project relationship would probably fit our usecase a little better (it will also be easier to collect)
What is tag_type? How would a user enter this value?
Name it as a simple type. I would imagine these fields to be specified as ORGANIZATION, LOCATION, USER_GENERATED (that is with no additional user input being defined)
In project table, add "status", website, slack_channel?
Let's include these (status would probably need to be a few additional values too.
from cti-website-frontend.
Additional Tables:
from cti-website-frontend.
@giosce asked: If we have user_id I suggest to use it in all the tables where a user enters data (organizations, tags, projects and relationship tables).
@shinjonathan said: Agreed. Yes. We'll need to check if we can find repository owners with the GitHub api.
@ExperimentsInHonesty The GitHub api does not allow you to discover the GithHub repo Admin or org owner, we will have to ask them directly in our workflow if we want that.
from cti-website-frontend.
The definition of user from this projects perspective is a visitor to our site, who might use the wizard, or in another way provide data, or one who does not provide any data and uses the site passively. We may need to rename those roles
visitor and user where visitor is using site without providing detail and user is providing info while interacting with tools on the site.
from cti-website-frontend.
@ddkbky will talk directly with @cnk time to be arranged by bo slacking cynthia
from cti-website-frontend.
@giosce will post revised questions.
from cti-website-frontend.
Some comments/questions on the data model:
- What is tag_type? How would a user enter this value?
tag_type is divided into organization tag and topic tag, the types are not entered by the users, but automatically recognized by the background. (How to identify tag types is a problem. We need to discuss it with Cynthia more. )
- In tag table, name could be "value"?
All right, since my mother language is not English, I welcome everyone to point out my grammatical mistakes.
- In organization table, instead of github_link a repository_url?
Yes, which I meant is exactly github repository url.
- In project table, add "status", website, slack_channel?
What the meaning of "status"?
The field "project_url" refers to projects' websites url. Do you think I should change the way I express it?
I think we don't need to know projects' slack channel.
@giosce Welcome more discussion!
from cti-website-frontend.
Another question from Gio:
If we have user_id I suggest to use it in all the tables where a user enters data (organizations, tags, projects and relationship tables).
Good question. Since we get the projects information form out apis, so users don't need to enter information about projects. We also have collected most of the organizations' information, so, in most cases, users also don't need to enter organization information. In some cases, if our database do not include some orgianizations, users can enter the organization information. Yes , maybe we can add a column "user_id" in "organizaiton" table to store the user ids who add new organizations.
from cti-website-frontend.
With @ddkbky and @emecas we are reviewing and updating the ER diagram and database tables and fields in https://docs.google.com/document/d/1nFra7PsgAiN56HO6IwJgYqUH1SaXIUr5/edit
It's better to refer to that document. Maybe we should load it in github so we version it.
from cti-website-frontend.
There are still many open questions. The ER diagram at https://docs.google.com/document/d/1nFra7PsgAiN56HO6IwJgYqUH1SaXIUr5/edit needs to be confirmed and questions clarified:
What is user table?
What is user_id?
Which page uses the project table?
How is user_organization table used?
Shouldn't search_information table have a counter and not having user_id?
Emerson saw the following tables in the current database:
data_link: what's this for? I see also a link API, but not sure of the purpose
data_organization: = organization (I don't think we need to prefix "data" to every table)
data_notificationsubscription
data_faq: Should add to ER
from cti-website-frontend.
I generate a sharable link of database schema in Lucidchart, all collaborators can edit by this link: https://lucid.app/lucidchart/invitations/accept/84645e05-399e-41a8-bb11-4e24abe77e0a . If it still doesn't work, let me know. @giosce
from cti-website-frontend.
Emerson saw the following tables in the current database:
data_link: what's this for? I see also a link API, but not sure of the purpose
data_organization: = organization
I think the data_organization table you see in the Django database corresponds to the Organization table in the ER diagram EXCEPT that the github_link is in the data_link table - with type "GitHub". Some projects also have website, twitter, facebook, or meetup links. It would be possible to model those as columns within the organization table but that isn't what I chose to do.
(I don't think we need to prefix "data" to every table)
The data_ prefix comes from the django app where we are storing the data. I just let Django autogenerate the table names from the models. It is possible to set the table names explicitly if that bothers people but it will involves dumping and restoring the database with the new names.
from cti-website-frontend.
Do we still need to keep this issue open or have we finalized the schema and can close this?
from cti-website-frontend.
Related Issues (20)
- 'Civic Tech Organizations' Page Audit - Web
- 'Collaborate with us' Page Audit - Web
- 'Donate' Page Audit - Web
- 'Share the CTI' Page Audit - Web
- Civic Tech Organization Search - Open to lowest level when only 1 Organization shows HOT 1
- Github Rate Limitations
- Fix "removed in Django41Warning" HOT 3
- Most Recently Updated on Organizations Page
- Store state changes for searches HOT 2
- Org list box size responsiveness HOT 2
- "has github" checkbox on Organizations page HOT 1
- Re-parent orgs to siblings of Code for All
- Update link for CTI assets HOT 1
- Add Link type for commonly used social media
- Fix or replace use-query-params for org search HOT 1
- Back button incorrectly redirecting HOT 2
- Civic Tech Organizations Page currently includes "%20" in query when back button is used HOT 1
- Update NavBar lower padding HOT 1
- Org Icons on site, but not on local
- Update FAQ: How do I add my organization to Civic Tech Index?
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from cti-website-frontend.