Yukun (Tifa) Tan's SPIN project.
Updated by ARFC.
Scrapes reactor coordinates from Wikipedia.
This is under a CC-BY license.
Run:
python scraping_wikidata.py
Output File:
coordinates.sqlite
scraping the web for international reactor data
License: Creative Commons Attribution 4.0 International
Hi Dr. @katyhuff! As mentioned earlier today, it seems like the code
CREATE TABLE testTable(
'index', 'Name' TEXT, 'Coord' TEXT, 'Long' REAL, 'Lat' REAL
)
'''
(lines 84-86 in https://github.com/ytan15/webscraping/blob/master/scraping_wikidata.py)
is not working. The error message says "probably unsupported type".
Earlier it was
CREATE TABLE testTable(
'index', 'Name' OBJECT, 'Coord' OBJECT
)
'''
and it was working fine.
Can you please take a look at it? Thanks!
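One common cause of a "probably unsupported type" error from Python's sqlite3 module is binding a value that is not an int, float, str, bytes, or None (for example a list or an array scalar) to a `?` placeholder. The sketch below is not the code from scraping_wikidata.py. It assumes the coordinate arrives as a WKT string such as `Point(long lat)`, and it converts every value to a plain Python type before inserting. The helper `parse_point` and the sample row are hypothetical.

```python
import re
import sqlite3


def parse_point(wkt):
    """Split a WKT literal such as 'Point(-88.2 40.1)' into (long, lat) floats."""
    pattern = r"Point\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)"
    match = re.fullmatch(pattern, wkt.strip())
    if match is None:
        raise ValueError(f"not a WKT point: {wkt!r}")
    return float(match.group(1)), float(match.group(2))


# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE testTable(
        "index" INTEGER PRIMARY KEY, "Name" TEXT, "Coord" TEXT, "Long" REAL, "Lat" REAL
    )
    """
)

# Hypothetical sample row; real rows would come from the scraper.
rows = [("Hypothetical Reactor A", "Point(-88.2 40.1)")]
for name, coord in rows:
    lon, lat = parse_point(coord)
    # Bind only plain str and float values so sqlite3 can store them.
    conn.execute(
        'INSERT INTO testTable("Name", "Coord", "Long", "Lat") VALUES (?, ?, ?, ?)',
        (str(name), str(coord), lon, lat),
    )
conn.commit()

stored = conn.execute('SELECT "Name", "Long", "Lat" FROM testTable').fetchall()
```

If the insert still fails, printing `type(value)` for each bound value just before the `INSERT` usually shows which column holds the unsupported object.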
There should be a LICENSE file that covers the entire repository.
There is a subsection in the README.md file that claims the repository is under CC-BY, and I'm not sure if that's adequate.
First, a determination has to be made about the type of License under which to cover this repository.
Second, that License should then be added through a pull request to this repository.
The web scrape results are missing some reactors that are currently shut down.
This is especially the case for reactors outside the United States.
Next step, if you're getting frustrated with wikipedia: check out the PRIS database. If necessary, @jbae11 can help with understanding it and perhaps would be willing to show you what he's done so far.
Some, but not all, of the information we want should be in that database.
Hi @ytan15 ! Sorry it took a while to describe this goal. Let's see what you can get into your sqlite3 file out of wikipedia alone. Here are some tutorials on scraping wikipedia and wikidata:
You ought to be able to find names and locations of reactors. You may also be able to find some of the other columns as well. Let us know what you can find!
This should be a mostly empty database.
ID, Name, Lat, Long, Institution, Country, Type, Fuel, Enrichment, Electrical Capacity, Thermal Capacity, Thermal Efficiency, Capacity Factor
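A minimal sketch of a mostly empty table with the columns listed above could look like the following. The table name `reactors` and the SQL types chosen for each column are assumptions, not something specified in this thread, and the in-memory database would be replaced by coordinates.sqlite in practice.

```python
import sqlite3

# Column names are taken from the list above; the SQL types are assumptions.
COLUMNS = [
    ("ID", "INTEGER PRIMARY KEY"),
    ("Name", "TEXT"),
    ("Lat", "REAL"),
    ("Long", "REAL"),
    ("Institution", "TEXT"),
    ("Country", "TEXT"),
    ("Type", "TEXT"),
    ("Fuel", "TEXT"),
    ("Enrichment", "REAL"),
    ("Electrical Capacity", "REAL"),
    ("Thermal Capacity", "REAL"),
    ("Thermal Efficiency", "REAL"),
    ("Capacity Factor", "REAL"),
]


def create_reactor_table(conn, table="reactors"):
    """Create the table if it does not exist; names are quoted because some contain spaces."""
    column_sql = ", ".join(f'"{name}" {sql_type}' for name, sql_type in COLUMNS)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({column_sql})')


conn = sqlite3.connect(":memory:")
create_reactor_table(conn)

# Only the name is known for this hypothetical entry; other columns stay NULL.
conn.execute('INSERT INTO reactors ("Name") VALUES (?)', ("Hypothetical Reactor A",))
cursor = conn.execute("SELECT * FROM reactors")
names = [description[0] for description in cursor.description]
row = cursor.fetchone()
```

Columns that Wikipedia or Wikidata cannot fill simply remain NULL, which keeps the schema stable while other sources are added later.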
Hi Dr. @katyhuff! I've been looking into scraping from Wikidata, and I think I've got the gist of it. I started to expand scraping_wikidata.py to find more information, but I've run into a few points of confusion. To get each reactor's country, I could add the line
?reactors wdt:P17 ?country .
into the query. The problem is that not all nuclear reactors have a country attribute (in Wikidata, "country" means "sovereign state of this item"). For example, Bhabha Atomic Research Centre (https://www.wikidata.org/wiki/Q854682) has no "country" attribute, although its description says it is based in India. So if I add this line to the query, it will filter out that entry. Is this something we should be concerned about?
Although I haven't started on Wikipedia yet, I'm a bit concerned that querying both Wikidata and Wikipedia could produce overlapping results, since Wikidata stores Wikipedia's data.
If Wikidata contains Wikipedia's data, how come there is a "Category: Nuclear power reactor types" in Wikipedia but not in Wikidata? Am I misunderstanding something?
Would you mind discussing these issues with me? Either here, or I could make an appointment with you if it would be easier to talk in person.
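On the country question, one possible approach is to wrap the country line in a SPARQL `OPTIONAL` block, so entries without a country are still returned with that field left empty. The sketch below only builds the query text and sends nothing. The function `build_reactor_query` is hypothetical, the rest of the query is an assumption about how the script selects reactors, and the class id `Q0` is a placeholder that must be replaced with the real Wikidata class.

```python
def build_reactor_query(class_qid, require_country=False):
    """Return SPARQL text; the country is optional unless require_country is True."""
    country_pattern = "?reactors wdt:P17 ?country ."
    if not require_country:
        # OPTIONAL keeps items that have no P17 statement instead of dropping them.
        country_pattern = f"OPTIONAL {{ {country_pattern} }}"
    return f"""
SELECT ?reactors ?reactorsLabel ?coord ?country WHERE {{
  ?reactors wdt:P31/wdt:P279* wd:{class_qid} .
  OPTIONAL {{ ?reactors wdt:P625 ?coord . }}
  {country_pattern}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
"""


# "Q0" is a placeholder, not a real class id; substitute the class being scraped.
optional_query = build_reactor_query("Q0")
strict_query = build_reactor_query("Q0", require_country=True)
```

With the optional form, an entry such as Bhabha Atomic Research Centre would stay in the results, and its country could be filled in later from another source.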