GitHub Repo: String10/ChinaOpenDataPortal-Frontend-Vue, based on Vue 3 and Tabler (an HTML dashboard UI kit built on Bootstrap).
Deploy the frontend using nginx inside a Docker container.
Expose: Port 80. Env-Vars: VITE_BACKEND_HOST.
Image build command:
# PWD: ./ChinaOpenDataPortal-Frontend-Vue
docker build -f docker/Dockerfile -t username/imagename .
Push image to Docker Hub:
docker push username/imagename:latest
Use this command to create a basic env.custom.sh:
echo "PYTHON_PATH=$(realpath $(which python))" >> ./scripts/env.custom.sh
GitHub Repo: cqsss/ChinaOpenDataPortal, based on Spring Boot and Thymeleaf.
Provides a basic web page service and can act as an API server for other frontend services.
The default index path is indices/current.
Use scripts/start-server.sh to start a process that acts as both API server and backend.
Append more arguments as needed, for example ./scripts/start-server.sh --server.port=9998.
Environment variables you may want to specify in env.custom.sh:
- MAVEN_PATH
- JAVA_PATH (Java 11 Recommended)
- ADMIN_USER
- ADMIN_PSWD
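As a rough illustration, an env.custom.sh for the backend might look like the sketch below. It assumes the same KEY=value form as the PYTHON_PATH line shown earlier and assumes that JAVA_PATH and MAVEN_PATH point at the executables themselves. Every value is a placeholder, and the snippet writes to a temporary file instead of ./scripts/env.custom.sh so it can be tried safely.

```shell
# Hypothetical values only; adjust them to your machine.
demo_env="$(mktemp)"
cat > "$demo_env" <<'EOF'
MAVEN_PATH=/usr/bin/mvn
JAVA_PATH=/usr/lib/jvm/java-11-openjdk/bin/java
ADMIN_USER=admin
ADMIN_PSWD=change-me
EOF

# Source the file to check that the assignments parse as shell.
. "$demo_env"
echo "Java: $JAVA_PATH, admin user: $ADMIN_USER"
```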
GitHub Repos:
- String10/ChinaOpenDataPortal-Metadata: Python scripts that crawl metadata from each portal, with multi-threading support. Crawled metadata is written to the database for use in the next step.
- String10/ChinaOpenDataPortal-IndexBuilder
Use scripts/fetch-data.sh to start a process for metadata fetching.
PS: One table serves as an archive for each metadata crawl, and another table is used for index building.
Environment variables you may want to specify in env.custom.sh:
- About crawler control:
  - CRAWL_WORKERS
  - CRAWL_FILES (whether or not to download data files)
- About database (only MySQL supported):
  - DB_ADDR
  - DB_PORT
  - DB_USER
  - DB_PSWD
  - DATABASE_NAME
  - REF_TABLE_NAME (specify a table as template)
  - PRD_TABLE_NAME (specify the table used in production)
- Others:
  - PYTHON_PATH (Python 3.6 recommended)
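Because the crawler needs all of the database settings above, it can help to check them before a long crawl starts. The helper below is a hypothetical sketch, not part of the repository: missing_db_vars is an invented name, and the values assigned at the end are placeholders.

```shell
# Hypothetical helper: print the database settings that are unset or empty.
missing_db_vars() {
    missing=""
    for name in DB_ADDR DB_PORT DB_USER DB_PSWD DATABASE_NAME REF_TABLE_NAME PRD_TABLE_NAME; do
        # Indirect lookup of the variable called $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    # Drop the leading space and print the remaining names.
    echo "${missing# }"
}

# Placeholder values for illustration only.
DB_ADDR=127.0.0.1
DB_PORT=3306
DB_USER=crawler
missing_db_vars
```

An empty line of output means every database variable has a value.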
After the metadata is written to the database, the index builder is started.
If necessary, it links the latest index to the path indices/current.
If the current index has been updated, the server receives a POST request and refreshes its index path.
Environment variables you may want to specify in env.custom.sh:
- BACKEND_URL
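Linking the latest index to indices/current can be pictured with the sketch below. It only illustrates the symlink switch in a scratch directory; the dated directory names are assumptions, and the index builder's actual naming and linking logic may differ.

```shell
# Work in a scratch directory so nothing real is touched.
workdir="$(mktemp -d)"
mkdir -p "$workdir/indices/2024-01-01" "$workdir/indices/2024-02-01"

# Point indices/current at the older index first, then switch to the newer one.
# -s: symbolic link, -f: replace an existing link, -n: treat an existing
# link to a directory as a file instead of creating the link inside it.
ln -sfn 2024-01-01 "$workdir/indices/current"
ln -sfn 2024-02-01 "$workdir/indices/current"

# Show where the link points now.
readlink "$workdir/indices/current"
```

Because the server reads from indices/current, switching the link changes which index is served without changing the server's configured path.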
+-------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+--------------+------+-----+---------+----------------+
| dataset_id | int | NO | PRI | NULL | auto_increment |
| title | varchar(255) | YES | | NULL | |
| description | text | YES | | NULL | |
| tags | text | YES | | NULL | |
| department | varchar(255) | YES | | NULL | |
| category | varchar(255) | YES | | NULL | |
| publish_time | varchar(255) | YES | | NULL | |
| update_time | varchar(255) | YES | | NULL | |
| is_open | varchar(255) | YES | | NULL | |
| data_volume | varchar(255) | YES | | NULL | |
| industry | varchar(255) | YES | | NULL | |
| update_frequency | varchar(255) | YES | | NULL | |
| telephone | varchar(255) | YES | | NULL | |
| email | varchar(255) | YES | | NULL | |
| data_formats | varchar(255) | YES | | NULL | |
| url | text | YES | | NULL | |
| province | varchar(255) | YES | | NULL | |
| city | varchar(255) | YES | | NULL | |
| standard_industry | varchar(255) | YES | | NULL | |
+-------------------+--------------+------+-----+---------+----------------+
+-------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+--------------+------+-----+---------+----------------+
| dataset_id | int | NO | PRI | NULL | auto_increment |
| title | varchar(255) | YES | | NULL | |
| description | text | YES | | NULL | |
| tags | text | YES | | NULL | |
| department | varchar(255) | YES | | NULL | |
| category | varchar(255) | YES | | NULL | |
| publish_time | varchar(255) | YES | | NULL | |
| update_time | varchar(255) | YES | | NULL | |
| is_open | varchar(255) | YES | | NULL | |
| data_volume | varchar(255) | YES | | NULL | |
| industry | varchar(255) | YES | | NULL | |
| update_frequency | varchar(255) | YES | | NULL | |
| telephone | varchar(255) | YES | | NULL | |
| email | varchar(255) | YES | | NULL | |
| data_formats | varchar(255) | YES | | NULL | |
| url | text | NO | | NULL | |
| province | varchar(255) | YES | | NULL | |
| city | varchar(255) | YES | | NULL | |
| standard_industry | varchar(255) | YES | | NULL | |
| url_hash | varchar(255) | NO | UNI | NULL | |
+-------------------+--------------+------+-----+---------+----------------+
Recommended:
nohup bash ./scripts/start-server.sh >> ./logs/server.txt 2>&1 &
Recommended:
# the first of every month at 0:00
echo "0 0 1 */1 * __ROOT=\"`realpath .`\"; bash \${__ROOT}/scripts/fetch-data.sh > \"\${__ROOT}/logs/fd-\`date --i\`.txt\" 2>&1" >> logs/auto-task.txt
crontab logs/auto-task.txt
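The escaping in the echo command above is easy to misread, so the sketch below builds the same entry and prints it instead of installing it. It uses pwd in place of realpath for portability, which is an assumption of this sketch. The five leading fields mean minute 0, hour 0, day 1, every month, any weekday.

```shell
# $(pwd) is expanded now, while the entry is being built.
# The escaped \$ and \` stay literal, so cron's shell expands them
# each time the job runs (giving a fresh date in the log file name).
entry="0 0 1 */1 * __ROOT=\"$(pwd)\"; bash \${__ROOT}/scripts/fetch-data.sh > \"\${__ROOT}/logs/fd-\`date --i\`.txt\" 2>&1"

# Print the entry exactly as it would appear in the crontab file.
printf '%s\n' "$entry"
```

Printing the entry first makes it easy to confirm that the project root is absolute and that the date command is still unexpanded.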